ABSTRACT

The report describes improved algorithms within a 
computer program for identifying spelling and word order errors in 
student responses. A "markup analysis" compares a student's response 
string to an author-specified model string and generates a graphical 
error markup that indicates spelling, capitalization, and accent 
errors, extra or missing words, and out-of-order words. The algorithm 
determines whether the response was acceptable or not, and computes a 
string of graphical error marks to be displayed below the student 
response. Synonyms and ignorable words can be specified and spelling 
errors, extra words, and word order errors can be accepted at the 
author's discretion. Spelling analysis is done using a dynamic 
programming algorithm that produces a least-cost edit trace; word 
order analysis is implemented using recursive branch and bound 
search. Improvements on earlier versions of the algorithm give more 
intuitive markup values. The algorithm is implemented as a HyperTalk 
XFCN. HyperTalk scripts can provide numerous input parameters that 
control the details of the matching process, and the algorithm 
returns a variety of fit measurements that characterize the match. 
Non-roman linear writing scripts are supported. The report contains 
detailed information on use of the function and serves as a user 
manual. Contains two references. (Author/MSE) 
ABSTRACT

This report describes improved algorithms for identifying spelling and word order errors in student
responses. A markup analysis compares a student's response string to an author-specified model
string and generates a graphical error markup that indicates spelling, capitalization, and accent
errors, extra or missing words, and out-of-order words. The algorithm determines whether the
response was acceptable or not, and computes a string of graphical error marks which can be
displayed below the student's response as error feedback. Synonyms and ignorable words can be
specified within the model, and spelling errors, extra words, and word order errors can be accepted at
the author's discretion. Spelling analysis is done using a dynamic programming algorithm which
produces a least-cost edit trace; word order analysis is implemented using recursive branch and
bound search. Improvements on earlier versions of the algorithm give more intuitive markup
values. The algorithm is implemented as a HyperTalk XFCN. HyperTalk scripts can provide
numerous input parameters that control the details of the matching process, and the algorithm
returns a variety of fit measurements that characterize the match. Non-roman linear writing
systems are supported. The report contains detailed information on use of the function and serves
as a user manual.
INTRODUCTION



The program described in this report implements improved versions of the algorithms for analyzing and 
marking word order errors in student responses first described in Hart (1988). Several minor difficulties in 
the previous version have been corrected, a number of new features have been added, and the whole utility 
has been reimplemented as a HyperCard XFCN (external function) to make it available in a Macintosh
environment. 

The basic functionality of the MarkUp XFCN is to accept a model string (in educational applications,
usually a "correct answer" specified by the courseware author) and a response typed in by the student, and
to generate a graphic markup that indicates where the response is incorrect. Here is an example:

Model: The very quick brown fox jumped over the big lazy dog

Response: thevery qick pronw foxx oar the lazy big dog. 

Markup: - [ \ = >< xAxxx A « ~ (1) 

The markup symbols displayed beneath the (incorrect) response indicate the editorial changes needed to make 
the response match the model. The symbol " \ " indicates one or more omitted letters; "x" an extra letter, 
and "»" an incorrectly substituted letter. The symbol pair "><" indicates transposition of two letters; "x" 
means an extra word (one that should not be in the response, or else one that is so badly misspelled that it 
can't be recognized); "A" indicates that one or more words is missing at that point in the response; and "«" 
means that the word is part of the response, but in the wrong position - it should be moved leftward to one 
of the "A" symbols. Incorrect capitalization is symbolized by "_" and an accent error by 

Spelling analysis is done by a dynamic programming algorithm which generates a markup corresponding 
to a least-cost editing trace. Editing operations are restricted to omission of a letter, insertion of an extra 
letter, substitution of one letter for another, or transposition of two adjacent letters. Capitalization and 
accent errors are also identified and marked. The user may specify the way in which capitalization 
disagreements will be treated: exact agreement required, upper case required wherever the model has 
uppercase, or case differences ignored. Run-together words are identified as such if they are adjacent in the
model. Some misspelling is tolerated in one or both run-togethers. 

Order analysis identifies extra words, missing words, and misplaced words. The user can specify various
degrees of tolerance when defining what constitutes a match with the model: spelling errors can be excused; 
incorrect word order can be excused, and extra words in the response can be excused. The order analysis 
returns three goodness of fit measures: proportion of matched words, proportion of words in correct order, 
and average amount of misspelling per matched word. 

When specifying the model, the author is allowed to specify one or more words which will be ignored if 
they occur in the student's response. Such a word or list must be surrounded by angle brackets, < >. A list
of "synonyms" (i.e., a set of words any one of which would be correct at a given position in a sentence)
must be surrounded by square brackets, [ ]. 

The basic theory underlying the word order and spelling analysis and markup, which was presented in Hart 
(1989), has not changed and the reader should consult that document for a detailed exposition of the 
approach used. This report is restricted to two goals: (a) describing the changes in the algorithms made in 
version 2.0, and (b) giving a detailed account of the HyperCard interface so that lesson developers can easily 
incorporate the markup facility into HyperCard stacks. Appendix 1, however, does present a complete 
listing of this version of the MarkUp XFCN. 



CORRECTIONS IN VERSION 2.0 OF THE MARKUP ALGORITHM 



In certain cases, version 1.0 of the MarkUp algorithm returned a markup string which, though technically
correct, was counter-intuitive. One example is:

Model: He lives in Chicago 

Response: He lives ín in Chicago.

Markup: ~ xx (2)

Here, it would agree better with common sense if the mis-accented "ín" were marked as the extra word rather
than the adjacent, and perfectly correct, "in":

Response: He lives ín in Chicago.

Better Markup: xx (3) 

The reason for the poor markup in (2) is simple. In determining the edit distance between pairs of words, 
version 1.0 of the MarkUp XFCN simply ignored any capitalization or accent errors. Word similarity was 
determined without any reference to these matters, and since spelling and word order analysis built on 
similarity information, they also ignored such errors. Only at the very end, when displaying the graphic 
markup did the algorithm check for such problems. Thus, as far as the word order routine is concerned, 
these two responses are identical: 

Response1: He lives in Chicago (4a)

Response2: He lives ín Chicago (4b)

In cases like (2), where there are redundant identical words, the MarkUp XFCN always uses the leftmost 
possibility. 

The fix for this problem was straightforward. Two new weight parameters have been introduced into the 
program, wcap (the weight of a capitalization error) and waccent (the weight of an accent error). Since 
capitalization and accent errors are normally perceived as relatively minor problems, we do not want them to 
have much influence on edit distance, so their default values have been made much smaller than the 
remaining weights. The current default weights are: 



wdelete        20
winsert        20
wsubstitute    30
wtranspose     20
wcap            1
waccent         1          (5)



This means that "ín" and "in" are now at an edit distance of 1 from one another, rather than 0 as previously.
Note also that the "average" distance associated with a single edit operation is now 

((wdelete + winsert + wsubstitute + wtranspose)/4) + ((wcap + waccent)/2) (6) 

This is appropriate because deletion, insertion, substitution, and transposition are mutually exclusive
operations, but a cap or accent error may or may not accompany a substitution. 
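
The effect of the two new weights can be illustrated with a small sketch of a weighted edit distance. This is not the XFCN's own code (that appears in Appendix 1); it is a minimal Python approximation under stated assumptions: the weight values are those of table (5) as read here, diacritics are detected with Unicode decomposition rather than MarkUp's own base-character and diacritic tables, and a substitution is charged wsubstitute only, with no additional cap or accent penalty. The function names are hypothetical.

```python
import unicodedata

# Default weights from table (5); the dictionary layout is an assumption.
WEIGHTS = {"wdelete": 20, "winsert": 20, "wsubstitute": 30,
           "wtranspose": 20, "wcap": 1, "waccent": 1}

def base(ch):
    """Return ch lower-cased with any diacritics stripped."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def minor_cost(m, r, w):
    """Cap and accent penalties for two characters sharing a base letter."""
    cost = 0
    if m.isupper() != r.isupper():
        cost += w["wcap"]
    if unicodedata.normalize("NFD", m).lower() != unicodedata.normalize("NFD", r).lower():
        cost += w["waccent"]
    return cost

def edit_distance(model, response, w=WEIGHTS):
    """Least-cost edit distance with omission, insertion, substitution,
    and transposition of adjacent letters (dynamic programming)."""
    rows, cols = len(model) + 1, len(response) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * w["wdelete"]          # model letters omitted
    for j in range(1, cols):
        d[0][j] = j * w["winsert"]          # extra response letters
    for i in range(1, rows):
        for j in range(1, cols):
            m, r = model[i - 1], response[j - 1]
            if base(m) == base(r):
                diagonal = minor_cost(m, r, w)   # same letter, maybe cap/accent error
            else:
                diagonal = w["wsubstitute"]
            best = min(d[i - 1][j - 1] + diagonal,
                       d[i - 1][j] + w["wdelete"],
                       d[i][j - 1] + w["winsert"])
            # Transposition of two adjacent letters.
            if (i > 1 and j > 1
                    and base(m) == base(response[j - 2])
                    and base(model[i - 2]) == base(r)):
                best = min(best, d[i - 2][j - 2] + w["wtranspose"])
            d[i][j] = best
    return d[-1][-1]

def average_operation_cost(w=WEIGHTS):
    """Formula (6): mean of the four exclusive operations plus mean minor penalty."""
    return ((w["wdelete"] + w["winsert"] + w["wsubstitute"] + w["wtranspose"]) / 4
            + (w["wcap"] + w["waccent"]) / 2)
```

Under these assumptions, edit_distance("in", "ín") is 1 while edit_distance("in", "in") is 0, so a routine that prefers the smaller distance would pair the model word with the correctly accented response word, as in (3).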

A related situation where version 1.0 of the MarkUp XFCN fails to perform satisfactorily is illustrated in 
this example: 
Model: the time

Response: then the time. 

Markup: x xxx (7) 

Here the problem is not due to accents or capitalization errors, but to the way that response words were 
paired with model words. When considering candidates for pairings, the procedure ignored the exact 
magnitude of edit distance between word pairs. Instead, a cutoff criterion was used. If the edit distance
did not exceed the cutoff value, the pair was considered a potential match; otherwise, they were considered to
be distinct words which could not be paired under any circumstances. Thus, in (7) the response words 
"then" and "the" are considered to be equally good matches for the model word "the", and "then" is selected 
because it happens to be leftmost, even though this entails a spelling error. 

A third case of inappropriate markup is 

Model: seen on a boat in Chicago 

Response: seen in a boat in Chicago 

Markup: A « « xx (8)

In this case the first "A" marks the location of the missing words "on a boat", consequently "on" must be 
inserted at the location of the "A", and the words "a boat" are marked with "«" to show that they must be
moved leftward from their current location to the location of the "A". This will leave the word "in" (the one
which has not been marked as extra) adjacent to "Chicago", to make up the phrase "in Chicago". The
second "in" is unneeded and is marked as extra. While technically correct, this markup is unintuitive and, in
fact, very confusing. Most readers would agree that the student simply substituted "in" for "on" in her
response, so the appropriate markup would be 

Response: seen in a boat in Chicago 

Markup: xx (9) 

The strange markup in (8) occurs because of the way the word order analysis operated in version 1.0.
Matching proceeded in two steps. First, response words were paired with candidate model words in such a
way that the number of inversions is minimized; that is, the criterion for doing the matching is to keep the 
order of words in the response as close as possible to their order in the model. In (8), the candidates were 

Response word: 1 2 3 4 5 6 



Model word candidates: 1 5 3 4 5 6 (10) 

Notice that model word 5, "in", is the only available candidate for pairing with response word 2 and also 
response word 5, both of which are "in". (Model word 2, "on", is not a candidate for pairing with anything 
because its normalized edit distance from every response word exceeds the cutoff threshold.) In the initial 
phase of matching, response words are paired one at a time, proceeding from left to right. When response
position 2 is considered, the only available candidate is model word 5. A response word is not allowed to 
remain unpaired if there is any candidate that will match it, so the pairing (2, 5) will be created. This 
means that when we reach response word 5, there is nothing left to pair it with, so it is tagged as an extra 
word. 

To remedy such problems, the word order analysis had a second stage, embodied in the 
adjust_solution procedure, which attempted to improve the quality of the match by taking into
account the actual magnitude of edit distance. This procedure looked at certain pairs of matches to see if 
exchanging the match between pairs would reduce the overall edit distance without increasing the number of 
inversions. However, attention was restricted to unmatched response words (namely, those matched with 
the "null" model word), and the algorithm merely checked to see if the overall match could be improved by 
taking a model word away from some other response word and pairing it with a currently unmatched one. 
The algorithm was roughly as follows, where M, M' denote words in the model ; R, R', words in the 
response, and R <-> M indicates a pairing of R with M: 
FOR each matched response word R DO
    FOR each unmatched response word R' to the right of R DO
    BEGIN
        M := the word paired with R;
        IF (edit distance of (M, R') is less than the edit distance of (M, R)) AND
           (pairing M with R' leads to no more inversions than pairing M with R)
        THEN
        BEGIN
            Re-pair M with R';
            Re-pair null with R
        END
    END                                                                    (11)

Although this adjustment cleared up many deficiencies in the match, it was not sufficient in general and 
still allowed markups like (8). To eliminate these shortcomings, version 2.0 of the adjust_solution
algorithm has been completely rewritten and made much more general. Now all possible pairs of matches,
R <-> M and R' <-> M', are considered, and if exchanging the pairing so that R <-> M' and R' <-> M
results in a lesser overall edit distance without increasing the number of inversions, or decreases the 
inversion count without increasing the edit distance, then the solution is modified to incorporate the 
exchanged match, essentially as indicated in the following algorithm: 

FOR each response word R DO
    FOR each response word R' DO
    BEGIN
        M  := the model word matched with R;
        M' := the model word matched with R';
        oldEditD := edit distance of (M, R) + edit distance of (M', R');
        newEditD := edit distance of (M, R') + edit distance of (M', R);
        oldInvK := number of inversions in original solution;
        newInvK := number of inversions when R <-> M' and R' <-> M;
        IF (newEditD < oldEditD AND newInvK <= oldInvK) OR
           (newEditD = oldEditD AND newInvK < oldInvK) THEN
        BEGIN
            Re-pair R with M';
            Re-pair R' with M
        END
    END                                                                    (12)

The revised algorithm considers many more potential exchanges, and thus improves the overall match in 
situations where the earlier version failed to do so. 
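
The exchange rule in (12) can be tried out on example (8) with the following Python sketch. It is only an approximation of the idea, not the adjust_solution procedure listed in Appendix 1. Several details are assumptions made for illustration: a plain unit-cost Levenshtein distance stands in for MarkUp's weighted edit distance, an unmatched response word (paired with the "null" model word) is charged its own length, the matching is held as a list of zero-based model indices (None for unmatched), and each pair of response words is visited once.

```python
from itertools import combinations

def levenshtein(a, b):
    """Unit-cost edit distance (a stand-in for MarkUp's weighted distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def pair_cost(model_word, response_word):
    """Cost of pairing; a null model word (None) costs the response word's length."""
    if model_word is None:
        return len(response_word)
    return levenshtein(model_word, response_word)

def inversions(match):
    """Number of matched pairs whose model positions are out of order."""
    paired = [m for m in match if m is not None]
    return sum(1 for x, y in combinations(paired, 2) if x > y)

def adjust_solution(model, response, match):
    """Exchange model partners between two response words when that lowers
    edit distance without adding inversions, or lowers inversions at equal
    edit distance, as in (12)."""
    match = list(match)

    def word(index):
        return None if index is None else model[index]

    for i, j in combinations(range(len(response)), 2):
        mi, mj = match[i], match[j]
        if mi == mj:                      # both unmatched: nothing to exchange
            continue
        old_d = pair_cost(word(mi), response[i]) + pair_cost(word(mj), response[j])
        new_d = pair_cost(word(mj), response[i]) + pair_cost(word(mi), response[j])
        swapped = list(match)
        swapped[i], swapped[j] = mj, mi
        old_k, new_k = inversions(match), inversions(swapped)
        if (new_d < old_d and new_k <= old_k) or (new_d == old_d and new_k < old_k):
            match = swapped
    return match
```

Starting from the pairing in (10), written here as [0, 4, 2, 3, None, 5], the sketch moves model word "in" from the second response word to the fifth. The edit distance is unchanged but the two inversions disappear, so the first "in" becomes the extra word, which corresponds to the markup in (9).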

A complete listing of version 2.0 of the MarkUp XFCN, including the adjust_solution procedure, 
appears in Appendix 1. 



USE OF CHARACTER CATEGORY INFORMATION FOR SPELLING ANALYSIS 



Version 1.0 of the MarkUp XFCN specified the phonetic category of each character (whether it was a vowel
or consonant), and allowed the user to modify such information, but made no use of it. In version 2.0, 
phonetic information is utilized during the process of spelling analysis to achieve a more psychologically 
meaningful measure of edit distance. 

The problem is the weight which should be attached to a substitution error, that is, an error which results 
because some character M in a model word has been replaced by a different character R in the student's 
response. Version 1.0 assigned the same weight (namely, the value of the variable wchange, which
defaults to 30) regardless of the actual identities of M and R. Consequently, "readable" has the same edit
distance from "readible" and "reidable" as it does from "readxble", "rezdible", "oedable", and "readaule".
Intuitively, however, the latter substitutions are less likely to occur, at least for students who are careful
typists but poor spellers -- a description which applies, for example, to many student language learners.
The reason is that spelling errors usually involve substituting one vowel for another or one consonant for
another, but only rarely a consonant for a vowel or vice versa. Of course typing errors, being a function of
keyboard position and n-gram frequency (Rumelhart & Norman, 1982), do not necessarily show this
pattern. We need a way of assigning differential substitution costs to different pairs of character categories. 

The problem of determining accurate substitution probabilities for character pairs (or character category 
pairs) can be solved only by some combination of empirical data and psychological modelling. However, 
there are situations where even rough estimates will be useful. Consider the case of Japanese writing, 
which utilizes three basic categories of character: katakana, hiragana, and kanji. In the computer
representations now becoming standard, all of these characters are represented as 32-bit character codes, and
MarkUp will treat each as if it were a single character. But this is psychologically inaccurate. Since the
hiragana characters represent CV syllables, a beginner will be relatively likely to confuse them with one
another. The substitution of a hiragana character for a kanji within a word should be a relatively rare event,
and thus have a high edit distance attached to it. Kanji do have internal structure, and thus might be
substituted one for another with varying degrees of probability. However, the character
code of a kanji does not reveal its internal structure sufficiently for MarkUp to determine similarity.
Thus, the cost of every kanji-kanji substitution must be the same. In fact, to prevent every word in the
model from being identified with every word in the response, such substitutions must be forbidden (infinite
cost). We thus need at least two categories of character, hiragana and kanji. Hiragana-hiragana substitution
will be permitted at a moderate cost, but kanji-kanji and kanji-hiragana substitution will be excluded. (This
example is theoretical, because version 2.0 of MarkUp does not support 32-bit characters.) 

Version 2.0 of MarkUp supports the assignment of differential substitution costs by providing five different 
"phonetic" categories. Actually, a better term would be character categories, because they need not actually 
concern phonetic properties of the character, and because the categories are mutually exclusive and 
exhaustive -- each character must belong to exactly one category. The categories have the hard-coded names

vowel, consonant, phon3, phon4, phon5 

These labels are purely conventional, however; the lesson author can redefine the categories in any way she
wishes. The default phonetic information assigns a, e, i, o, u, and y to the "vowel" category and all other 
characters to the "consonant" category; as a result, the three remaining categories "phon3", "phon4" and 
"phon5" are not used at all. 

To make use of these categories, a 5x5 matrix of weights called phon_matrix is created and initialized with
these default values: 



                          Response Char Category

Model Char
Category       vowel   consonant   phon3   phon4   phon5

vowel            30        36        36      36      36
consonant        36        30        36      36      36
phon3            36        36        30      36      36
phon4            36        36        36      30      36
phon5            36        36        36      36      30



Each row M corresponds to a possible category of the model character, and each column R to a possible 
category of a response character. The cell phon_matrix[M, R] gives the cost of replacing a model character 
of category M by a response character of category R. For instance phon_matrix[vowel, consonant] is 36, 
the cost of substituting a consonant for a vowel. Given the default partition of characters between vowel and
consonant, the matrix entries indexed by phon3, phon4, and phon5 are redundant, because they will never be 
referred to during the analysis. 

The default weights are assigned according to the following rule: intra-category substitutions (vowel-vowel, 
consonant-consonant, and other cells along the diagonal) are assigned the value of the parameter wchange,
which has the default value 30. Inter-category substitutions (off-diagonal cells) are assigned the value
(1.2 * wchange) = 36. The factor 1.2 is arbitrary, but was chosen so that vowel-consonant substitutions
would be more expensive than vowel-vowel or consonant-consonant, and yet not so expensive that they 
would prevent words containing typos from being identified as potential matches. The user can modify 
both character category assignments and the phon_matrix values, as explained below. 
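
The default rule can be written out as a short Python sketch. It is an illustration only, not the XFCN's data structure: the nested-dictionary layout and the function names are assumptions, and the category lookup covers just the default partition (a, e, i, o, u, y as vowels, everything else as consonants).

```python
CATEGORIES = ["vowel", "consonant", "phon3", "phon4", "phon5"]

def build_phon_matrix(wchange=30, factor=1.2):
    """Diagonal cells get wchange; off-diagonal cells get factor * wchange."""
    return {m: {r: wchange if m == r else round(factor * wchange)
                for r in CATEGORIES}
            for m in CATEGORIES}

def category(ch, vowels="aeiouy"):
    """Default partition: listed letters are vowels, all else consonants."""
    return "vowel" if ch.lower() in vowels else "consonant"

def substitution_cost(model_char, response_char, matrix):
    """Cost of replacing a model character by a response character."""
    return matrix[category(model_char)][category(response_char)]
```

With the defaults, substituting "i" for "a" (as in "readible") costs 30, while substituting "x" for "a" (as in "readxble") costs 36, so the second misspelling lies slightly farther from "readable".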



USING THE MARKUP XFCN 



The MarkUp XFCN is implemented as a Macintosh code resource, so it must be made available to your 
stack before you use it. This can be done in several ways: (1) copy it directly into the stack resource fork 
with the RESCOPY or RESEDIT utilities; (2) copy it into your HOME stack resource fork using the same
utilities; or (3) execute a START USING STACK command to attach a stack which already contains the 
MarkUp XFCN as a resource. Copying directly to your own stack is more stable, but also wasteful of 
space if copies proliferate. 

Once the MarkUp XFCN has been made available, it can be called like any other HyperCard function. It 
enables you to produce a graphic error markup in a HyperCard stack. When you call MarkUp, you must
input a model string and a response string. MarkUp will match the two and return a markup string, as well 
as other information about the quality of the match. You can then display this information to the student, 
or use it in any other way you choose.

As one step in generating the graphical error markup, the MarkUp XFCN judges the response (i.e.,
evaluates it for correctness). You can control the amount and nature of error tolerance during this judging 
process by changing the values of various input parameters to MarkUp. To be specific, you can specify 
synonyms for various words in the response. You can specify words which should be ignored if they occur 
in the response. And you can stipulate that the response will be judged correct even if it contains spelling 
errors, or word order errors, or extra words. 

Evaluating and marking up a user's response is a complex activity which can be modified in various ways 
to reflect various kinds of content and instructional needs. You can control the way in which MarkUp 
operates by setting input parameters. Thirteen of these parameters can be set by passing values directly to 
the MarkUp XFCN when it is called. More esoteric aspects of operation can be controlled by putting 
suitable values into five global HyperCard variables: theMarkUpWeights, theMarkUpSymbols,
theMarkUpCharInfo and theMarkUpDebug.

The direct return of the Markup XFCN is a markup string. This is simply a sequence of symbols in 
character string format, like that in (1), which indicates the nature of the mismatches between the model and 
the student's response. Additional information may be returned in the four HyperCard global variables 
theMarkUpReturnValues, theMarkUpMaps, theMarkUpParamDisplay, and
theMarkUpDebug. 
CALLING THE MARKUP XFCN



The syntax for calling the MarkUp XFCN allows for up to 14 input parameters; however, 12 of these are
optional and need not be specified except in special situations. The general form of a MarkUp function call 
is 



MarkUp( model,
        response,
        capFlag,
        extraWordsOk,
        anyOrderOk,
        misspellOk,
        wordMarkUpNeeded,
        runTogetherNeeded,
        adjustNeeded,
        shortCut,
        markUpMapsNeeded,
        parameterDisplayNeeded,
        spellingOnlyNeeded,
        debugNeeded )                                                      (13)



The meaning of each of the fourteen input parameter slots, and the range of values acceptable in that slot, is
as follows:



model
    String or container specifying the correct response.

response
    String or container holding the student's response.

capFlag
    If "exact_case" (the default), the capitalization in the response must
    exactly match that in the model or else cap errors will be marked. If
    "authors_caps", the response must have a capital wherever the model does,
    but additional capitals in the response are permitted. If "ignore_case",
    case is ignored when matching model and response.

extraWordsOk
    If True, judge OK even if extra words are present in the response. If
    False (the default), judge NO if extra words are present.

anyOrderOk
    If True, order of words in the response does not have to match the order
    of words in the model in order to get an OK judgment. If False (the
    default), judge NO if words are not in the specified order.

misspellOk
    If True, judge OK even if some words are misspelled. If False (the
    default), judge NO if there is any spelling error.

wordMarkUpNeeded
    If True, an error markup string will be generated and returned. If False
    (the default), no string (i.e., a null string) will be returned, only a
    judgment of OK or NO. If you simply wish an evaluation and don't want to
    display the graphic markup as error feedback, you can speed things up
    slightly by setting this parameter to False. In that case, your script
    can use the other information returned by MarkUp to determine what
    feedback to give the student.

runTogetherNeeded
    If True (the default), MarkUp will find and mark run-together words. If
    False, run-togethers will not be identified as such, but will be marked
    as misspelled or unidentified words. Turning off this feature when MarkUp
    is running slowly will speed things up, but at the cost of degrading the
    quality of the markup.

adjustNeeded
    If True (the default), MarkUp will try to "improve" the graphical error
    markup to make it more intuitive. If False, this improvement is not done.
    Do not turn off improvement unless speed is a serious problem, because it
    significantly degrades the quality of the markup.

shortCut
    If True (the default), MarkUp will do a "fast" spelling analysis that
    will not generate a spelling markup between badly misspelled pairs. If
    False, force a complete spelling analysis for every word. Use False if
    you need a markup for very badly misspelled words (e.g., when using
    MarkUp in a spelling lesson). Turning off shortCut may slow the program
    down significantly when model and/or response are long.

markUpMapsNeeded
    If True, MarkUp will generate and return in the HyperCard global variable
    theMarkUpMaps two "maps" showing which model words are paired with which
    response words. If False (the default), these maps will not be returned,
    and the value of theMarkUpMaps remains unchanged.

parameterDisplayNeeded
    One of the characters "v", "b", "d", "c", "h", "p", "w", "f", "s", or
    "m", or else nothing at all (the default). If one of these characters is
    present, then information of the requested type will be returned in the
    HyperCard global variable theMarkUpParamDisplay. Otherwise the value of
    theMarkUpParamDisplay remains unchanged. The character that you use as an
    input parameter determines the kind of information that will be returned:

    "v"   VERSION of the MarkUp XFCN which is running
    "b"   Table of BASE CHARACTER specifications
    "d"   Table of DIACRITIC specifications
    "c"   Table of CASE specifications
    "h"   Table of PHONETIC CATEGORY specifications
    "p"   Table of PUNCTUATION CHARACTER specifications
    "w"   Values of the JUDGING WEIGHTS AND THRESHOLDS
    "f"   Values of the JUDGING FLAGS
    "s"   Values of the MARKUP SYMBOLS
    "m"   Values in the PHON_MATRIX

    This parameter allows you to copy a judging table into a HyperCard
    container, where it can be inspected using the SHOW VARIABLES option of
    the HyperCard debugger. The format in which this information is returned
    is discussed below.



spellingOnlyNeeded
    If no value or "x" (the default), then the standard spelling and word
    order analysis is done. If the value is "r" or "p", a special,
    spelling-only analysis will be done: the Model and Response strings will
    be immediately submitted verbatim to the spelling analyzer and an edit
    trace will be generated by comparing every character in the two strings,
    including punctuation, spaces, and return characters. None of the special
    syntax used to define synonym and ignorable word lists in the model will
    be recognized. Since there are no word boundaries, no order analysis will
    be done. The value of spellingOnlyNeeded determines the nature of the
    return:

    "p"   Return a "pretty" markup string, suitable for display beneath the
          response string.
    "r"   Return the raw markup string, without prettying it up.
    "x"   Do not do the special spelling-only analysis; do the normal
          spelling and word order analysis.

    Since the Model and Response strings are treated as if they were words
    when spellingOnlyNeeded is "r" or "p", neither string can exceed the
    maximum word length of 20 characters.

    The information returned in the HyperCard global variable
    theMarkUpReturnValues is different for the special analysis, and consists
    of a raw edit distance and a normalized edit distance.

    Returning a raw trace forces a least-cost edit trace string (markup
    string) to be computed no matter how dissimilar the Model and Response
    are, so this option is useful for spelling lessons or other cases where
    an exact spelling markup is needed even when a response is badly
    misspelled. Only the "pretty" markup will display properly, but it has
    incomplete information about the nature of errors present, so the "r"
    option is appropriate if you want to do computations on the markup
    string.

debugNeeded
    Setting this parameter to True causes technical information about the
    internal workings of MarkUp to be returned in the HyperCard global
    theMarkUpDebug. Included are the edit distance matrix, values of
    ignorable words in the model and response, and candidate match sets for
    each response word. This information is intended only for debugging and
    development purposes. If False (the default), no information is returned.



The Response and Model strings must be specified when you call the MarkUp XFCN. The remaining 12 
parameters are optional. If you are satisfied with the default value of a parameter, simply leave that slot 
empty (of course, a comma must be present to mark the location of the unused slot if other parameter 
values follow). If the unused parameters are dangling (i.e., come after the last real parameter value), the
commas may also be omitted, following the usual HyperCard convention for input parameters. 

IMPORTANT: Each of the 14 input parameters reverts to the default value after each call to MarkUp, so 
non-default values must be respecified each time you call MarkUp. 

HyperCard evaluates each parameter before sending it to the MarkUp XFCN, so parameters may be specified 
by any HyperCard expression, including constants, variables, chunk specifiers, or field specifiers. 
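
The three capFlag settings described above can be made concrete with a small Python sketch. It is a hypothetical illustration, not MarkUp's code: the function name is invented, the two words are assumed to be already aligned letter for letter, and positions where the letters themselves differ are left to the spelling analysis.

```python
def cap_errors(model_word, response_word, cap_flag="exact_case"):
    """Return the positions at which a capitalization error would be marked."""
    if cap_flag not in ("exact_case", "authors_caps", "ignore_case"):
        raise ValueError("unknown capFlag value: " + cap_flag)
    if cap_flag == "ignore_case":
        return []
    errors = []
    for pos, (m, r) in enumerate(zip(model_word, response_word)):
        if m.lower() != r.lower():
            continue                      # a spelling difference, not a cap error
        if cap_flag == "exact_case" and m != r:
            errors.append(pos)            # any case disagreement counts
        elif cap_flag == "authors_caps" and m.isupper() and not r.isupper():
            errors.append(pos)            # only a missing capital counts
    return errors
```

For the model word "Chicago", the response "chicago" is marked at position 0 under both "exact_case" and "authors_caps", whereas "CHICAGO" is marked at positions 1 through 6 under "exact_case" but accepted under "authors_caps".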



SPECIFYING THE MODEL AND RESPONSE PARAMETERS



The simplest form of correct answer is a single word or string of words: 

The quick brown fox jumped over the lazy dog (14)

The model should not contain any characters which are currently defined as punctuation marks, because such 
characters are removed from the student's response string before it is judged. If such characters appear in the 
model string, it will be impossible for the response to match the model. 

The square brackets "[ ]" and the angle brackets "< >" have special uses in the model string. Square 
brackets are used to specify a list of (one or more) synonymous words. The words must be separated by one 
or more spaces; other punctuation is not acceptable. The words do not have to be synonyms in the usual 
sense; in fact any collection of words can be put into a synonym list. Such a list simply specifies that any 
member of the list will be acceptable at that point in the model, as in 

The [quick fast speedy] brown fox jumped over the [lazy stupid] dog (15) 

Any word in the list will be acceptable at that position in the response. Thus the model shown will result 
in the following markups: 



Response: The quick brown fox jumped over the lazy dog. OK (16a) 

Markup: (none) 

Response: The speedy brown fox jumped over the stupid dog. OK (16b) 

Markup: (none) 

Response: The brown speedy fox jumped the lazy dog over. NO (16c) 

Markup: Δ « Δ « 



Angle brackets specify a list of (one or more) ignorable words. 

<the a> brown fox jumped over [lazy stupid] dog (17) 

There may be several lists of ignorable words, which may appear anywhere in the response, but the effect 
will always be the same as a single list of ignorables at the front of the model. Any response word which 
matches any of the ignorable words "well enough" will simply be treated as if it were not present in the 
response. "Well enough" is defined to permit capitalization and accent errors, but no other kinds of spelling 
errors. Thus, if (17) is used as a model, the following responses will all be judged correct: 

Response: A brown fox jumped over the stupid dog. (18a) 

Response: The brown fox jumped over the lazy dog. (18b) 
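
The bracket conventions described above can be illustrated with a short sketch. It is written in Python rather than HyperTalk and is not part of the MarkUp XFCN; the function names parse_model and matches are hypothetical. The sketch assumes exact matching in strict word order, with no spelling tolerance, and it treats ignorable words as matching regardless of capitalization only (accent tolerance is left out).

```python
import re

def parse_model(model):
    """Split a model string into word slots and a set of ignorable words.

    A plain word gives a one-member slot, a [synonym list] gives one slot
    holding all of its members, and every word inside <...> goes into a
    single ignorable set, wherever the list appears in the model.
    """
    slots, ignorable = [], set()
    pattern = r"<([^>]*)>|\[([^\]]*)\]|(\S+)"
    for angle, square, word in re.findall(pattern, model):
        if angle:
            ignorable.update(w.lower() for w in angle.split())
        elif square:
            slots.append(set(square.split()))
        elif word:
            slots.append({word})
    return slots, ignorable

def matches(model, response, punctuation=".,;:()[]<>?!"):
    """True if the response fills every slot of the model, in order."""
    slots, ignorable = parse_model(model)
    # Punctuation is removed from the response before it is judged.
    cleaned = "".join(" " if ch in punctuation else ch for ch in response)
    words = [w for w in cleaned.split() if w.lower() not in ignorable]
    return len(words) == len(slots) and all(
        w in slot for w, slot in zip(words, slots))

model_17 = "<the a> brown fox jumped over [lazy stupid] dog"
print(matches(model_17, "A brown fox jumped over the stupid dog."))
```

With model (17), both responses (18a) and (18b) are accepted by this sketch, because "a" and "the" are dropped from the response before the remaining words are compared with the model slots.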
Version 2.0 of the MarkUp XFCN places some limitations on both model and response string: 

The model and response strings must each be 255 characters or less (or 22 characters, 
if you have selected the SpellingOnly analysis). 

No single word in the model or response may be more than 22 characters (punctuation and spaces 
do not count as part of a word). 

Neither model nor response may contain more than 18 words. Each entry in an ignorable word 
list or synonym list counts as a word. 

Exceeding these limits will cause MarkUp to abort the judging process and return an error string which 
defines the nature of the error. 
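
A script can check these limits itself before calling MarkUp. The following is a minimal Python sketch of such a check, not the test that MarkUp performs internally; the function name check_limits and the wording of the messages are hypothetical, and the punctuation set is assumed to be the default one. Because the brackets are treated as punctuation here, every entry of a synonym or ignorable list is counted as a word, as the rule above requires.

```python
MAX_STRING_CHARS = 255  # limit for a model or response string
MAX_WORD_CHARS = 22     # limit for a single word
MAX_WORDS = 18          # limit on words per string

def check_limits(text, punctuation=".,;:()[]<>?!"):
    """Return a list of limit violations for one model or response string."""
    problems = []
    if len(text) > MAX_STRING_CHARS:
        problems.append("string longer than 255 characters")
    # Punctuation and spaces do not count as part of a word.
    cleaned = "".join(" " if ch in punctuation else ch for ch in text)
    words = cleaned.split()
    if len(words) > MAX_WORDS:
        problems.append("more than 18 words")
    if any(len(w) > MAX_WORD_CHARS for w in words):
        problems.append("word longer than 22 characters")
    return problems

print(check_limits("<the a> brown fox jumped over [lazy stupid] dog"))
```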

These limits are hard-coded in PASCAL as global constants, and can be changed by recompiling the 
PASCAL source code. They are imposed by the fact that Version 2.0 of MarkUp defines its large data 
structures as static arrays within PASCAL. Since MarkUp XFCN runs under HyperCard and has to borrow 
its space from HyperCard, increasing the limits above causes the HyperCard stack to overflow into the heap 
and immediately terminates HyperCard with system error 28 (stack has moved into application heap). 



ADDITIONAL PARAMETERS CONTROLLING THE MARKUP PROCESS 



The more technical aspects of the markup analysis can be controlled by changing the values of the five 
HyperCard global variables theMarkUpPunctuation, theMarkUpSymbols, 
theMarkUpWeights, theMarkUpPhonMatrix and theMarkUpCharInfo. You can change the 
values of these variables by using the HyperCard PUT command, and can inspect their current values by 
using the HyperCard debugger's SHOW VARIABLE option. It is unlikely that you will have reason to 
change these variables, but specialized judging situations sometimes require it. 

Each of these globals expects a list of comma-separated items as a value. To change a value simply PUT a 
new list into the appropriate global variable. You must always provide the entire list of values, including 
all the values which you are not changing. 

Each time that the MarkUp XFCN is executed, it examines the values of each of these globals. If the value 
is empty, then the global is ignored and the default values built into MarkUp (as indicated immediately 
below) will be used. If the value is non-empty, then the contents of the global will be read into the 
appropriate PASCAL tables and variables before the markup analysis is begun. Hence, you may revert to 
the default values of the parameters at any time by simply PUTting empty into the appropriate global. 

Note that these parameter values are "sticky": once you have PUT a value into a global, it will continue to 
be effective until you change it, or until you leave HyperCard. You do not need to reset these values each 
time you call the MarkUp XFCN. Of course, it will not hurt anything if you do so, except to slow things 
down a bit. 

IMPORTANT: The information from these global variables is converted into PASCAL strings which 
cannot be more than 255 characters long. Hence, never put more than 255 characters of text into these 
variables. 

Each of the five global variables expects to receive a list with a very exact format, as explained here: 

theMarkUpPunctuation This string of characters determines what characters MarkUp 
will consider to be punctuation marks if they occur in the 
student's response. Its default value is ( ".,;:()[]<>?!" 
& space & return ). If you redefine the punctuation set, 
don't forget to include the space and return characters. 

theMarkUpSymbols This string is a list of 12 characters which determine the 
symbols used to display the error markup. The default value 
of the string is "__~XΔ«x\=><[". Each position in the 
list corresponds to a particular type of error: 

 1  addcap            "_"  underscore 
 2  dropcap           "_"  underscore 
 3  accenterror       "~"  tilde 
 4  extraword         "X"  capital x 
 5  missingword       "Δ"  capital delta 
 6  moveword          "«"  double left arrow 
 7  extraletter       "x"  lower case x 
 8  missingletter     "\"  backslash 
 9  substituteletter  "="  equal sign 
10  transposeletter1  ">"  left angle bracket 
11  transposeletter2  "<"  right angle bracket 
12  runonword         "["  left square bracket 


The shapes shown are those which display in courier font. If 
you use a font other than courier, you may need to change 
some of the characters in this list, selecting appropriate 
characters from the font that you are actually using. 

theMarkUpWeights This is a list of nine comma-separated numbers which 
determine how spelling errors are computed. The default value 
of the list is "20,20,30,20,1,1,0.67,0.35,0.2". The meaning of 
each position is: 

 1  winsert          20 
 2  wdelete          20 
 3  wchange          30 
 4  wtranspose       20 
 5  wcap             1 
 6  waccent          1 
 7  cutoff           0.67 
 8  prop_errors      0.35 
 9  runon_criterion  0.2 


The names in the second column are the PASCAL variable 
names used internally by the MarkUp XFCN. The first six 
numbers are the costs or weights attached to (respectively) 
letter insertion, omission, substitution, transposition, 
capitalization errors, and accent errors, when matching for 
spelling errors. The last three numbers have the following 
meanings: 

cutoff: Ratio of word lengths (shorter/longer) must exceed this 
value, or the edit distance between them will automatically be 
set to infinity (relevant only when the "shortcut" input 
parameter is set to True). 
prop_errors: Normalized edit distance between two words must 
be less than this value, or the two words will be considered 
non-matches. 

runon_criterion: Maximum edit distance which can exist 
between the concatenation of two adjacent model words, M and 
M', and a response word R, if R is to be considered as a 
candidate match for M and M' run together. 

theMarkUpPhonMatrix This is a list of 25 (=5x5) comma-separated integer values. 

The first five values correspond to the first row of the 
phon_matrix; the next five to the second row, and so on. 
Item number R of row number M specifies the cost of 
replacing a model character of category M by a response 
character of category R. I. e., the list of entries is in this 
order: 

(m1 r1) (m1 r2) (m1 r3) (m1 r4) (m1 r5) (m2 r1) (m2 r2) 
(m2 r3) .... 

theMarkUpCharInfo Defines character properties such as case, base character, 

diacritic, and phonetic category. If you are using special 
character sets or special alphabets or keyboards, you may need 
to change this information. The values you provide here will 
be read into various PASCAL arrays internal to the MarkUp 
XFCN code resource. How to change these tables is described 
in the next section. 
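
To make the role of the first four weights and of cutoff more concrete, here is a minimal Python sketch of a weighted edit distance with adjacent-letter transposition. It is an illustration under stated assumptions, not the PASCAL code inside MarkUp: the function name edit_distance is hypothetical, capitalization and accent weights and the phon_matrix adjustment are left out, and the result is a raw distance because the normalization applied before the prop_errors test is not specified in this section.

```python
import math

def edit_distance(model, response, w_insert=20, w_delete=20,
                  w_change=30, w_transpose=20, cutoff=0.67, shortcut=True):
    """Least-cost edit distance from a model word to a response word."""
    # Shortcut: the length ratio (shorter/longer) must exceed cutoff.
    if shortcut and model and response:
        shorter, longer = sorted((len(model), len(response)))
        if shorter / longer <= cutoff:
            return math.inf
    m, r = len(model), len(response)
    # d[i][j] = least cost of turning model[:i] into response[:j]
    d = [[0] * (r + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * w_delete          # model letters missing from response
    for j in range(1, r + 1):
        d[0][j] = j * w_insert          # extra letters in response
    for i in range(1, m + 1):
        for j in range(1, r + 1):
            same = model[i - 1] == response[j - 1]
            best = min(d[i - 1][j] + w_delete,
                       d[i][j - 1] + w_insert,
                       d[i - 1][j - 1] + (0 if same else w_change))
            # Two adjacent letters exchanged.
            if (i > 1 and j > 1 and model[i - 1] == response[j - 2]
                    and model[i - 2] == response[j - 1]):
                best = min(best, d[i - 2][j - 2] + w_transpose)
            d[i][j] = best
    return d[m][r]

print(edit_distance("quick", "qiuck"))
```

With the default weights a substitution (30) costs less than an omission plus an insertion (40), and a transposition (20) costs less than two substitutions (60), so the least-cost trace prefers the more specific error type.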



CHANGING THE CHARACTER INFORMATION TABLES 



As explained above, the default character information tables may be modified by placing information into 
the global variable theMarkUpCharInfo. However, the information there must be formatted in a 
precise way before it can be used by MarkUp. Each HyperCard line in theMarkUpCharInfo must 
contain information of one of four types: base character, diacritic, case, or phonetic. 

IMPORTANT: Each HyperCard line of theMarkUpCharInfo will be placed in a PASCAL string; 
hence no line should ever exceed 255 characters. If you have too much information to fit on one line, use 
additional lines for the remaining information. 

The format for each type of information is as follows: 

base character info: b,CHAR,x x x x ... 

Here "b" is a switch which informs MarkUp that the following 
information concerns base characters. CHAR is a base 
character, and x x x x ... stands for a list of characters with 
diacritics which have CHAR as their base character, e. g., 

b,e,é è ê ë É 
b,i,í ì î ï 
b,c,ç 

Notice that case, like base and diacritic, is a character attribute, 
so all base characters should be specified as lower case 
characters. 

diacritic info: d,DIACRIT,x x x x ... 

Here "d" is a switch which informs MarkUp that the following 
information concerns diacritics. DIACRIT must be one of the 
diacritic values: acute, grave, circumflex, 
dieresis, supero, cedilla, tilde, macron. x x x 
x ... stands for a list of characters which have that type of 
diacritic, e. g., 

d,acute,á é í ó ú 
d,grave,à è ì ò ù 
d,cedilla,ç Ç 
d,dieresis,ä ë ï ö ü 

case info: c,CASE,x x x x ... 

Here "c" is a switch which informs MarkUp that the following 
information specifies character case. CASE must be one of 
the two case values up_case or down_case, and x x x x ... is 
a list of characters that have that attribute, e. g., 

c,up_case,A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 

c,down_case,a b c d e f g h i j k l m n o p q r s t u v w x y z 
0 1 2 3 4 5 6 7 8 9 

phonetic info: p,PHON,x x x x ... 

Here "p" is a switch which informs MarkUp that the following 
information specifies "phonetic" properties. PHON is one of 
the five values vowel, consonant, phon3, phon4, or 
phon5; the list x x x x ... is a list of the characters which 
have that attribute, e. g., 

p,vowel,a e i o u y 

p,consonant,b c d f g h j k l m n p q r s t v w x z 

Phonetic information is used to adjust the edit distances 
assigned for mismatched letters. 

Regardless of the type of information, the first two items in each line must be separated by commas. The 
remaining entries in the line may be run together or separated by one or more spaces for readability (as in 
these examples). There can be any number of lines, and the different types of information can be mixed in 
any order. 
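
The line format just described is simple enough to parse mechanically. The following Python sketch shows one way to read such lines into lookup tables; it is only an illustration of the format, the function name parse_char_info is hypothetical, and the way MarkUp stores this information in its PASCAL arrays is not shown in this report.

```python
def parse_char_info(text):
    """Read character information lines into one table per switch.

    Each line has the form SWITCH,VALUE,characters where SWITCH is
    b, d, c or p. The characters may be run together or separated by
    spaces, and lines of different types may appear in any order.
    """
    tables = {"b": {}, "d": {}, "c": {}, "p": {}}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Only the first two commas are separators.
        switch, value, chars = line.split(",", 2)
        for ch in chars:
            if not ch.isspace():
                tables[switch.strip()][ch] = value.strip()
    return tables

info = "\n".join([
    "b,e,é è ê ë",
    "d,acute,á é í ó ú",
    "c,up_case,ABC",
    "p,vowel,a e i o u y",
])
tables = parse_char_info(info)
print(tables["b"]["è"], tables["d"]["é"], tables["p"]["y"])
```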



INFORMATION RETURNED BY MARKUP 



The MarkUp XFCN directly returns the graphical markup string, which can simply be displayed beneath the 
student's response. (Note, however, that unless the markUpNeeded input value was True, MarkUp will 
return an empty string.) 
If there was some problem which prevented MarkUp from carrying out judging in the usual way, the 
judging will be aborted and an error message will be returned in place of the usual string. This error 
message will give a brief description of the nature of the problem and will always be prefaced by a single 
"%" character. This will be true even if no markup was requested. Hence, HyperCard can look at the first 
character of the MarkUp XFCN return to see whether the return is an actual markup string or an error 
message and act accordingly. 
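
The test on the first character can be sketched as follows. This is a Python illustration of the logic only (in a stack the same test would be written in HyperTalk); the function name interpret_result and the sample error text are hypothetical, since the actual wording of MarkUp's error messages is not listed here.

```python
def interpret_result(returned):
    """Classify the string returned by MarkUp as markup or error message."""
    if returned.startswith("%"):
        # Error messages are always prefaced by a single "%" character.
        return ("error", returned[1:].strip())
    return ("markup", returned)

print(interpret_result("%sample error text"))
print(interpret_result(""))
```

An empty return is classified as markup here, which is what MarkUp returns when no markup was requested and no error occurred.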

The MarkUp XFCN may also return information in four other global HyperCard variables: 



theMarkUpReturnValues A list of comma-separated items. Usually, the first item will 
be "True" (if the response was judged OK), or "False" (if it 
was judged NO). The remaining items contain additional 
information about the match. (If spellingOnlyNeeded 
is not "x", however, the return will be different.) This 
information is returned every time MarkUp is called. 

theMarkUpMaps If markup maps were requested, they are placed in this variable, 
a response-to-model map in the first line, and a model-to-response 
map in the second line. This information is returned 
only if the input parameter markUpMapNeeded is set to 
True. 

theMarkUpParamDisplay If a display of judging parameters was requested by setting 
parameterDisplayNeeded to one of the non-default 
options, then the requested information is returned in this 
variable. 

theMarkUpDebug Reports technical information on the operation of MarkUp. 
This global is intended for development purposes and is created 
by MarkUp only if debugNeeded is turned on. 



IMPORTANT: If you have not requested the map or parameter display information, then the values of the 
three global variables theMarkUpMaps, theMarkUpParamDisplay and theMarkUpDebug are left 
unchanged. Specifically, they will not be set to empty, and the information they contain may be out of 
date. It is up to your HyperCard script to make sure that any information you read from these globals is up 
to date. In contrast, the values in theMarkUpReturnValues are updated each time MarkUp is called, 
whether you request it or not, hence they are always current. 

The information returned in these global variables can be inspected visually with the HyperTalk debugger 
facility, or by PUTing a copy into a field. Or it can be read directly by your scripts. Neither you nor your 
program will be able to make any sense out of the information returned, however, unless you know how it 
is formatted. The details of formatting are explained in the following paragraphs. 

theMarkUpReturnValues always returns information about the match between model and response. 
The nature of the information, however, depends on the value of the input parameter 
spellingOnlyNeeded. 

If spellingOnlyNeeded was "x" (the normal case), indicating a standard word-to-word matching, then 
theMarkUpReturnValues returns a comma-separated list of four values: judgedOk, pMatched, 
pNoninversion, and aveDist. The meaning of these items (in the order they appear in the returned list) is: 

judgedOk This item will be "true" if the response matched the model, or 
"false" if it did not. A match is defined relative to the current 
values of tolerance for misspellings, extra words, and word 
order. This return value can be used to make decisions about 
feedback and branching after a response has been judged. 
However, if something goes wrong during the judging process, 
so that judging could not be completed in the normal manner, 
judgedOk will be set to "false". Consequently, you should not 
use this value to make instructional decisions without also 
checking the markup string to see if an error occurred. 



pMatched This value, which ranges from 0.0 to 1.0, measures the 
proportion of words matched. It is computed by dividing the 
number of matched words by the total number of word types 
(non-identical words) in the model and response combined, 
excluding ignorable words. Equivalently, it may be thought of 
as the cardinality of the intersection of the set of model words and 
the set of response words, divided by the cardinality of their 
union. 

pNoninversion This value, which ranges from 0.0 to 1.0, measures the 
proportion of words which are in correct order by dividing the 
number of inversions in the solution into the total number of 
non-ignorable words in the response. Unmatched words 
(including ignorable words) are excluded when computing this 
inversion count. 

aveDist This value, which ranges from 0.0 to 1.0, is computed as the 
average edit distance between model-response word pairs 
which were actually matched. It provides a measure of how 
well the model fits the response with respect to spelling. 
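
The set-based reading of pMatched given above can be written out directly. The Python sketch below is a simplified illustration, not MarkUp's computation: the function name p_matched is hypothetical, two words count as matched only when they are identical (MarkUp also matches synonyms and acceptably misspelled words), and the value returned when both sets are empty is an arbitrary choice.

```python
def p_matched(model_words, response_words, ignorable=()):
    """Size of the intersection of the word sets divided by their union."""
    skip = set(ignorable)
    model = set(model_words) - skip
    response = set(response_words) - skip
    union = model | response
    if not union:
        return 1.0  # assumption: nothing to match counts as a full match
    return len(model & response) / len(union)

print(p_matched("a b c".split(), "a b d".split()))
```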



If the value of spellingOnlyNeeded was "r" or "p", which simply requests a least-cost edit trace for 
the model and response strings, then theMarkUpReturnValues returns a list of two comma-separated 
items: rawEditDistance, normalizedEditDistance. 

Remember that the values in theMarkUpReturnValues are returned automatically each time you call 
MarkUp. You do not need to request this information, and indeed you cannot prevent it from being 
computed and returned. 
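
Since the layout of theMarkUpReturnValues depends on spellingOnlyNeeded, a script has to know which mode was used before it reads the items. The following Python sketch shows the two cases side by side; it illustrates the format only, the function name parse_return_values is hypothetical, and the sample strings are invented for the example rather than taken from a real MarkUp call.

```python
def parse_return_values(values, spelling_only="x"):
    """Turn the comma-separated return list into a dictionary."""
    items = [item.strip() for item in values.split(",")]
    if spelling_only == "x":
        # Standard word-to-word matching: four items.
        judged_ok, p_matched, p_noninversion, ave_dist = items
        return {
            "judgedOk": judged_ok.lower() == "true",
            "pMatched": float(p_matched),
            "pNoninversion": float(p_noninversion),
            "aveDist": float(ave_dist),
        }
    # "r" or "p": least-cost edit trace only, two items.
    raw, normalized = items
    return {"rawEditDistance": float(raw),
            "normalizedEditDistance": float(normalized)}

print(parse_return_values("True,0.8,1.0,0.05"))
print(parse_return_values("40,0.1", spelling_only="r"))
```

Remember that judgedOk is also "false" when judging was aborted, so a script should check the returned markup string for an error message before acting on this value.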

theMarkUpMaps returns information about how the response words are matched with words in the 
model. An example will clarify this: 

Model: The quick brown fox [jumped leaped] over the lazy dog 

Response: The brown quick fox walked over the big lazy dog. (19) 

If a markup map is requested for this model and response, the value returned in theMarkUpMaps will be 

1,3,2,4,0,6,7,0,8,9 
1,3,2,4,0,6,7,9,10 
1,5,11,17,21,28,33,37,41,46 (20) 

The first line of (20) is a response-to-model map. It will have as many comma-separated items as there are 
words in the response. The first position of this list corresponds to the first response word, the second 
position to the second response word, and so on. The number in each position tells which model word is 
paired with that position. In case no model word is paired with a particular response word, 0 is returned in 
that position. Thus, in the above example, line 1 indicates that response word 1 goes with model word 1, 
response word 2 goes with model word 3, response word 3 goes with model word 2, response word 5 is 
unmatched, and so on. 

The second line of (20) is a model-to-response map. It will have as many comma-separated items as there 
are word positions in the model (a synonym list counts as one word position, and an ignorable word list 
does not count at all). The first list position corresponds to the first model word, the second to the second 
model word, and so on. The value in each position tells which response word is paired with that position 
in the model. If no response word is paired, this fact is represented by the presence of "0" in that position. 
In the example above, line 2 indicates that model word 1 is associated with response word 1, model word 2 
with response word 3, model word 3 with response word 2, model word 5 is unmatched, and so on. 

The information in the two maps obviously overlaps, and in fact when there are no missing or extra words 
they give identical information. But because either model or response words may remain unpaired, both 
maps are needed to completely specify the match. 
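
The relationship between the two maps can be checked with a few lines of code. The Python sketch below uses the first two lines of (20); the helper names parse_map, unmatched_positions and maps_agree are hypothetical and are not part of MarkUp.

```python
def parse_map(line):
    """Convert one comma-separated map line into a list of integers."""
    return [int(item) for item in line.split(",")]

def unmatched_positions(mapping):
    """1-based positions whose partner is 0, i.e. unpaired words."""
    return [pos for pos, partner in enumerate(mapping, start=1)
            if partner == 0]

def maps_agree(resp_to_model, model_to_resp):
    """True if every pairing in the first map is mirrored in the second."""
    return all(model_to_resp[m - 1] == r
               for r, m in enumerate(resp_to_model, start=1) if m != 0)

resp_to_model = parse_map("1,3,2,4,0,6,7,0,8,9")
model_to_resp = parse_map("1,3,2,4,0,6,7,9,10")
print(unmatched_positions(resp_to_model))   # extra response words
print(unmatched_positions(model_to_resp))   # missing model words
print(maps_agree(resp_to_model, model_to_resp))
```

For example (19), the unpaired response positions are 5 and 8 ("walked" and "big") and the unpaired model position is 5 (the synonym list), which is why neither map alone describes the whole match.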

The third line of (20) contains information about the starting character number of each response word, as 
computed by MarkUp. For example, the first word occupies characters 1-4 of the response, the second word 
characters 5-10, and so on. This is different from the HyperCard definition of a word, because MarkUp 
includes the space or other punctuation mark immediately preceding a word as belonging to that word, as 
well as trailing spaces up to the next word. These pointers can be used not only to pull individual "words" 
out of the response string, but also the markup substring which corresponds to that word out of the markup 
string. Note, however, that the markup string has an extra leading character (to accommodate a symbol for 
missing words at the beginning of the response), and possibly an extra trailing character (to accommodate a 
symbol for missing words or letters at the end). 
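
The use of these pointers can be sketched as follows, using the response of example (19) and the third line of (20). This is a Python illustration with a hypothetical function name, split_by_starts; the markup string used here is a stand-in built only to show the one-character offset, not real MarkUp output.

```python
def split_by_starts(response, markup, starts):
    """Pair each response "word" with the markup substring beneath it.

    starts holds the 1-based starting character of each word. The markup
    string is assumed to carry one extra leading character, so its
    substrings are shifted right by one position.
    """
    bounds = list(starts) + [len(response) + 1]
    pieces = []
    for begin, end in zip(bounds, bounds[1:]):
        word = response[begin - 1:end - 1]
        marks = markup[begin:end]
        pieces.append((word, marks))
    return pieces

response = "The brown quick fox walked over the big lazy dog."
starts = [1, 5, 11, 17, 21, 28, 33, 37, 41, 46]
stand_in_markup = " " + response.upper()  # stand-in, aligned to the response
for word, marks in split_by_starts(response, stand_in_markup, starts):
    print(repr(word), repr(marks))
```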

The global variable theMarkUpParamDisplay will return different kinds of information depending on 
how you set parameterDisplayNeeded when MarkUp was called. In every case, there will be at 
least two lines. The first will consist of the string "MUParamDisplay" immediately followed by a single 
letter which identifies the type of information. Successive lines contain the actual information. (The 
following examples show the information which will be returned when all the default values are in effect.) 

If you use "v" as an input value of paramDisplayNeeded, the return will be of this form. 



MUParamDisplay v 

Markup XFCN 2.0 18 Aug 93, 12:47 PM - R. Hart UI/UC Language Learning Laboratory 



the second line of which identifies the MarkUp XFCN version number and the date and time of compilation. 

If you use "b" as an input value of paramDisplayNeeded, you will get back information in 
theMarkUpParamDisplay about the base characters corresponding to various characters, in this 
format 



MUParamDisplay b 

A~a, A=» a , C=c, £=e, R=n ,0=o, 0=u, a=a, a=a , a=a, a=a , a=a , a=a, ?"C, e=e, e=e, e=e, e=e, i = i, i=i, i=i, i = 
i, ft=n, 6=o, 6=0,6-0,6=0, 6-0, Ci=u, ii=u,0=u, u = u , A=a, A=a,0=o, y=y, ?=y, A=a, £=e, A=a, E=e, £=e, i = i, 
I»i, I-i, I-i,6-o,6"o,6-o,O=u,C=u,0=u 



Here each HyperCard item is of the form C=B. This denotes that the base character of C is B. Only those 
characters which are actually accented appear in this list. If a character is not in the list, this means that it 
has no accent and thus is its own base character. 

If you use "d" as an input value of paramDisplayNeeded, then you will get back information in 
theMarkUpParamDisplay about the diacritic of each letter, in a format similar to that used for base 
character: 



MUParamDisplay d 

A=4, C=7, £=l,R=8,6=4,0=4,a=l, a=2,a=3,a=4, a = 8,<;=7, e = l, 6 = 2, <§ = 3, e = 4, 1 = 1, i=2, 1 = 3, i = 4 , fl = 8, 6= 
1, 6-2, 6=3,6=4,o=8,u = l. ii=2,0=3, ii = 4 , A=2 , A=8 , 0=8, y=4 , ? = 4 , A=3, £=3, A=l, £=4, £ = 2, 1-1, != 3, 1 = 4, 
1=2,6=1,0-3,6=2,0-1,0=3,0=2 






Each HyperCard item will be of form C=N, where C is a character, and N is an integer between 1 and 14. 
Each integer represents the value of a diacritic, thus: 

 0  no_accent 
 1  acute 
 2  grave 
 3  circumflex 
 4  dieresis 
 5  umlaut 
 6  supero 
 7  cedilla 
 8  tilde 
 9  subdot 
10  superdot 
11  subhat 
12  superhat 
13  subhook 
14  macron 



This set of diacritics is hard-coded into the PASCAL program for MarkUp and cannot be modified without 
changing the list of diacritic__variants in the PASCAL global TYPE declaration. 

Characters which are not accented will not appear on the list. If a character is not on the list, this means 
that it has the default diacritic type 0 = no_accent. 

If you use "c" as the input value of paramDisplayNeeded, then you will get back information in 
theMarkUpParamDisplay in this form: 



MUParamDisplay c 

A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z 



The second line is a comma-separated list of all the characters which are currently classified as upper-case. 
Only the upper-case characters are displayed. If a character is not on this list, it is classified as a lower-case 
character. 

If the input value of paramDisplayNeeded was "h", then information about the phonetic value of the 
characters will be returned in theMarkUpParamDisplay: 



MUParamDisplay h 

-1, -1, "1, =1, = 1,W, =1, = 1, »1, = 1, =1, np=l, para=l, =1, =1, =1, =1, =1, =1, =1, 
= 1, =1, =1, =1, =1, -1, =1, =1, =1,-=1,-=1, =1, !=1,"=1,# = 1,S = 1,% = 1,& = 1, '=1, ( = 1,) =1, 
*=1, +=1, ,=1, h=l, .=1, /=1, 0=1, 1=1, 2 = 1, 3=1, 4 = 1, 5=1, 6=1, 7 = 1,8=1, 9=1, :=1, ; = 1, < = 1, ==1, >'l, 
?=1, 8 = 1, A=0, B=1,C = 1, D=l, E=0,F=1,G=1, H-l, 1 = 0, J-l, K=l, L=1,M=1, N=l,O=0, P= 1,0=1, R = l, S=l, 
T-l, U-0, V-l, W-1,X=1, V-0, Z = l, [-1, W, ]*1, *«1«_-1, " -1, a=0,b=l, c=l,d=l,e=0, f = 1 , <j-» 1 , h= 1 , 
i = 0, i-J , k-1, i«l,m«l,n»l, o-0,p-l ,q-l , r-1 , s-1 , t-l , u»0 , v-l , w-1 , x-1 , y-0, /.« 1, ( = 1, 1=1,1 = 1, 
— 1, -1,A-0,A=0,C=1,E-0,R---I,0»0,0=0,a=0,a»0,a=0,a-O,a=0,a=0,g=l,e=0,e=0,e=0,e=0, 1 = 0, 
1-0, 1-0, i-0, 6=0,d-0,6=0,6=0,i5=0, 0=0, 0 = 0, 0 = 0, U = 0 , ! = 1 , °= 1 , « = 1 , £= 1 , §= 1 , • = 1 , 1= 1 , 11 = 1 , 

®=i,©-i,™=i, - = i, - = i, *-i, «=o, 0=0, «-i, ±=1,5=1,2=1, y»i,u,-i,a=i,2-i, n-i.tt-t-J-i-'-i.'-i. 
n=l,ae=O,0=O,i = l, ;=l,-.= l,V=l,/ = l,-=l,A=l,«=l,»=l,...= l, = 1 , A= 0 , A=0 , 0=0 , (F.-0 , os-- 0 , - = 1 , - = 1 , 

u =i, «=i, ' =1, +=i,o=i, s>-o, v=o, •'=1,0=1, < =1, >=i,n=i,n=i, *=i, -=i, , =i, „=i,-a=i, a=o, e=o, 
A=o,£=o, E-o, 1=0, 1=0,1-0, i=o,6=o,6=o,rt=i,6-o, o«o,0-o, u=o, 1=1, -=1, ■ =1, - = 1, * = l, * * 1 , 

rh'-h ."V" 1 : , 

Here the second line consists of a list of comma-separated items of form C=P. C is a character and P is an 
integer value which corresponds to the phonetic category of C. The current possible values of P are 

0 vowel 

1 consonant 

2 phon3 

3 phon4 

4 phon5 



The phonetic value of every character is returned, in character-code order. In courier and most other fonts, 
the first 32 characters are control characters and will display as blank boxes. The 44th character, which is 
the comma, displays this way 

,,=1, 

thus creating a spurious empty item at the 44th position. This must be taken into account when using the 
HyperCard ITEM chunk designator to parse this information. 
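
The effect of the comma entry, and one way around it, can be shown with a short sketch. It is written in Python and is only an illustration of the parsing problem: the function name parse_char_items is hypothetical, the sample line is a short invented excerpt in the C=P format, and entries whose character is displayed as a multi-letter name are not handled.

```python
import re

def parse_char_items(display_line):
    """Parse "C=P" items, allowing C itself to be a comma.

    Each item is one character, an equal sign and an integer, followed
    by a separating comma or the end of the line.
    """
    pairs = re.findall(r"(.)=(\d+)(?:,|$)", display_line, flags=re.S)
    return [(ch, int(value)) for ch, value in pairs]

line = "+=1,,=1,-=1,==1,a=0"
print(line.split(","))         # naive split: note the empty item
print(parse_char_items(line))  # the comma is kept as a character
```

Splitting on commas, as the ITEM chunk designator does, yields an empty item followed by "=1" where the comma entry stands; scanning for whole C=P items avoids this.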

If you enter "p" as the value of paramDisplayNeeded when you call MarkUp, then you will get back 
in theMarkUpParamDisplay a list of all the characters that count as punctuation. This example 
displays the default value for the set of punctuation characters: 

MUParamDisplay p 
 !(),.:;<>?[] 



The list occupies two lines because the first character is a return character which displays as a carriage 
return/line feed. The second character is a space, and the third an exclamation mark. 

If you enter "w" as the value of paramDisplayNeeded when you call MarkUp, then a list of nine 
comma-separated items will be returned in theMarkUpParamDisplay: 



MUParamDisplay w 

20,20,30,20,1,1,0.67,0.35,0.2 



These nine items represent the values of the following parameters which are used in the spelling analysis 
(the values shown in this example are the default values): 



item 1  winsert          20 
item 2  wdelete          20 
item 3  wchange          30 
item 4  wtranspose       20 
item 5  wcap             1 
item 6  waccent          1 
item 7  cutoff           0.67 
item 8  prop_errors      0.35 
item 9  runon_criterion  0.2 



Entering a value of "f" for paramDisplayNeeded will cause theMarkUpParamDisplay to 
contain information on the various judging flags in this format: 



MUParamDisplay f 

false, false, false, true, true, true, true, false, f 



The second line contains a list of nine comma-separated items which control the way judging is performed: 

item 1  anyOrderOk 
item 2  extraWordsOk 
item 3  misspellOk 
item 4  wordMarkUpNeeded 
item 5  runtogetherNeeded 
item 6  adjustNeeded 
item 7  shortcut 
item 8  markUpMapsNeeded 
item 9  paramDisplayNeeded 



Finally, if you use "s" as an input value of paramDisplayNeeded, then 
theMarkUpParamDisplay will contain a 12-character list of all the markup symbols: 



MUParamDisplay s 
__~XΔ«x\=><[ 



The meaning of a character depends on its position in the list: 



char 1   addcap            "_"  underscore 
char 2   dropcap           "_"  underscore 
char 3   accenterror       "~"  tilde 
char 4   extraword         "X"  capital x 
char 5   missingword       "Δ"  capital delta 
char 6   moveword          "«"  double leftward arrow 
char 7   extraletter       "x"  lower case x 
char 8   missingletter     "\"  backslash 
char 9   substituteletter  "="  equal sign 
char 10  transposeletter1  ">"  left angle bracket 
char 11  transposeletter2  "<"  right angle bracket 
char 12  runonword         "["  left square bracket 



If you use "m" as the value of paramDisplayNeeded, theMarkUpParamDisplay will contain the values of the 
phon_matrix matrix discussed earlier. Since there are 5 possible character categories, phon_matrix is a 5x5 
matrix and contains 25 entries. The first entry corresponds to row 1, column 1; the second entry to row 1, 
column 2; ... the sixth entry to row 2, column 1, and so on. Row M, column R contains the cost of 
replacing a character of type M by one of type R: 



MUParamDisplay m 

30,36,36,36,36,36,30,36,36,36,36,36,30,36,36,36,36,36,30,36,36,36,36,36,30 
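
The row-by-row ordering means that the entry for row M, column R is item number (M - 1) x 5 + R of the list. A minimal Python sketch of the lookup follows; the function name phon_cost is hypothetical, and the values are the default entries displayed above.

```python
def phon_cost(items, m, r, size=5):
    """Cost of replacing a model character of category m by a response
    character of category r (both counted from 1), taken from the flat
    row-by-row list of matrix entries."""
    return items[(m - 1) * size + (r - 1)]

defaults = [int(x) for x in (
    "30,36,36,36,36,36,30,36,36,36,36,36,30,"
    "36,36,36,36,36,30,36,36,36,36,36,30").split(",")]

print(phon_cost(defaults, 1, 1))  # same category
print(phon_cost(defaults, 1, 2))  # different categories
```

With the default values, replacing a character by one of the same category costs 30 and replacing it by one of a different category costs 36.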



EXAMPLE 1: MARKING UP A RESPONSE IN HYPERCARD 



Here is a simple annotated example of how to use MarkUp to do the answer judging in your own stack. It 
assumes that the current card has a card field called "prompt" where a question is displayed, and a second card 
field called "response" where the student will type in a response to the question. A third card field named 
"markup" must be located below the "response" field. It will be used to display the markup feedback. If 
card field "markup" does not exist, it can be created and positioned by executing the setUpMarkUp handler, 
as explained below. 

The following handlers should be placed in the card script if MarkUp is only needed on one card. If MarkUp 
will be needed throughout the stack, put these handlers in the stack script and change the openCard and 
closeCard handlers to openStack and closeStack handlers. 



Before you can use MarkUp, you must attach the stack which contains the MarkUp XFCN. This stack is 
named "markUp XFCN 2.0" in the software distribution of MarkUp. Besides the code resource which 
implements MarkUp, the stack contains in its stack script a number of handlers useful for integrating 
MarkUp into HyperCard programs. 



on openCard 
  -- If the markup XFCN stack is located in some other folder, change 
  -- the path name accordingly. 
  start using stack "myFolder:markUp XFCN 2.0" 
  -- This enables use of the markup XFCN. Parameter value "response" is 
  -- the name of the field where the student will type in a response. 
  -- It must be a CARD field. 
  setUpMarkUp "response" 
  -- Ask a question to elicit a written response. In a real drill program, 
  -- this would be done somewhere other than in openCard, e. g., in 
  -- the handler which presented the next drill item. 
  put "Type in the French words for the numbers 1 to 10." into card field "prompt" 
end openCard 

on closeCard 
  -- markup XFCN uses a lot of space, so disconnect it as soon as it's not 
  -- needed. 
  stop using stack "myFolder:markUp XFCN 2.0" 
end closeCard 



on judgeResponse 
  -- This handler contains the commands to do the judging and the markup. 
  -- It must be called from the response field. 
  -- The following globals MUST be declared in any handler that calls 
  -- the MarkUp XFCN, because MarkUp may examine their values with callbacks. 
  global theMarkUpReturnValues, theMarkUpSymbols, theMarkUpPunctuation, ¬ 
  theMarkUpParameters, theMarkUpCharInfo, theMarkUpParamDisplay, ¬ 
  theMarkUpMaps, theMarkUpDebug 
  -- Erase any previous markup. 
  put empty into card field "markup" 
  -- Copy the correct answer string into a variable. In a real drill program, 
  -- this might be done in the handler that presents the item. 
  put "un deux trois quatre cinq six sept huit neuf dix" into model 
  -- Execute the MarkUp XFCN and store the markup string which is returned. 
  put markUp( model, card field "response" ) into markUpString 
  -- Use returned values to generate feedback display. 
  if ( item 1 of theMarkUpReturnValues = "True" ) then 
    -- If response was judged correct, no markup required. 
    put "OK" into card field "markup" 
  else 
    -- If response had errors, display markup string. 
    put markUpString into card field "markup" 
  end if 
end judgeResponse 



In addition, the following handlers must be placed in the script of the card field where the response is typed
(in this example, card field "response"):



on returnInField
  judgeResponse
  show card field "markup"
end returnInField

on keyDown ch
  put the selectedChunk into s
  hide card field "markup"
  select s
  pass keyDown
end keyDown



This returnInField handler causes response judging to begin as soon as the student presses the
RETURN key. The keyDown handler makes the markup disappear as soon as the student begins to edit
the response. It is important to do this because when the response string changes, the markup becomes
invalid (e.g., if the student deletes letters, some markup symbols may no longer be under the right letters).

Executing the setUpMarkUp handler changes the default font of the response field to be "Courier" (but 
does not change any other text properties). This change to a fixed-width font is required so that the markup 
symbols will be properly aligned beneath the letters of the response. It is not essential to use Courier; any
non-proportional font will do, but setUpMarkUp must be modified if you want to use some other font. 
If you use the MarkUp XFCN only to judge the response and do not intend to display the error markup, 
then you need not execute setUpMarkUp at all. 

IMPORTANT: setUpMarkUp installs the "markup" field behind and slightly below the response field. It
assumes that the response field is only one line deep, and that it has been shaped so that the bottom of the
field is immediately beneath the longest descender. If your response field is not configured this way, the
markup characters may be completely hidden by the bottom of the response field.

For expository simplicity, the judgeResponse handler above supposes that the judging proceeded
normally. In actual courseware, the markup string should always be checked so that error conditions such
as too many letters in a word or too many words in a sentence can be detected and the student informed that
the "NO" judgment was due to a special problem. Whenever there is some problem which prevents judging
from being completed, MarkUp returns, instead of the normal markup, a string which begins with the
character "%". The remainder of the string gives a brief description of the problem. To use this feature,
modify the code in judgeResponse along these lines:



-- Use returned values to generate feedback display.
if ( char 1 of markUpString = "%" ) then -- Check for error.
  delete char 1 of markUpString -- Get rid of the "%" char.
  answer "There was a problem judging your answer:" & return & ¬
  markUpString & return & "Please try again." with "OK"
else
  if ( item 1 of theMarkUpReturnValues = "True" ) then
    -- If response was judged correct, no markup required.
    put "OK" into card field "markup"
  else
    -- If response had errors, display markup string.
    put markUpString into card field "markup"
  end if
end if
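The same check can be expressed outside HyperCard. The following Python fragment is only an illustrative sketch and is not part of MarkUp; the function name split_markup_result is hypothetical, and the only fact taken from the text is that an error return begins with the character "%".

```python
def split_markup_result(result):
    """Separate a MarkUp-style return string into (judging_completed, text).

    Assumption: an error return starts with "%" and the rest of the string
    is the problem description; anything else is a normal markup string.
    """
    if result.startswith("%"):
        return False, result[1:]   # drop the "%" and keep the description
    return True, result


# Example: an error return versus a normal markup string.
print(split_markup_result("%Couldn't get matrix memory."))
print(split_markup_result("   x   ="))
```

Checking the prefix before looking at the judgment keeps a "NO" caused by a special problem distinct from a "NO" caused by a wrong answer.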







EXAMPLE 2: MODIFYING THE MARKUP AND PUNCTUATION LISTS 



The following HyperCard program segment illustrates how the details of judging and the appearance of the
markup can be manipulated by resetting the global variables theMarkUpPunctuation and
theMarkUpSymbols:



global theMarkUpSymbols, theMarkUpPunctuation



-- Change some of the markup symbols. New symbols are:
-- "?" = extra word, "*" = make upcase, - make downcase, "°" = extra letter,
-- = transposed letters, "-" = incorrect letter, "'" = missing letter.

put --?A« 0, [" into theMarkUpSymbols 

-- Change punctuation so that a hyphen will be treated as a word separator, and 
-- the final period will be judged. 

put ( "-?!,:;()[]" & space & return ) into theMarkUpPunctuation



These commands can be executed any time before the MarkUp XFCN is called. The changes which they
cause will persist until you reset the global variables again. The above modifications in the markup
symbol and punctuation lists will result in markups like the following:

Model: The quick brown fox jumped over the lazy dog. 

Response: the qick prown foxx jumpde the big-Lazy dog over

Markup: - 0 ,w, A ??? - ~ « (21) 

The spelling- and case-error symbols have been chosen so that they are smaller and higher in the line than 
the defaults. This results in a less cluttered spelling markup and clearer visual distinction between the 
spelling and the word-order symbols. On the other hand, the meaning of the spelling symbols may be 
somewhat less obvious. Notice that the omitted period at the end of the response is now marked as a 
missing character, and that "big" and "lazy" are judged as separate words, in keeping with the changes made 
to theMarkUpPunctuation. 
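The effect of the punctuation list on word division can be illustrated with a short Python sketch. It is not MarkUp's own code: the function split_words and the first list below are assumptions made for illustration, while the second list is the one set in Example 2 (HyperCard's return character is written here as "\r").

```python
def split_words(text, punctuation):
    """Split text into words, treating every character in `punctuation`
    as a separator that is not itself judged."""
    words, current = [], []
    for ch in text:
        if ch in punctuation:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# Hypothetical list that treats the period as punctuation but not the hyphen.
period_list = ".?!,:;()[] \r"
# List from Example 2: hyphen is a separator, period is an ordinary character.
example2_list = "-?!,:;()[] \r"

print(split_words("the big-Lazy dog.", period_list))
print(split_words("the big-Lazy dog.", example2_list))
```

With the Example 2 list, "big" and "Lazy" become separate words and the final period stays attached to "dog", which is why a missing period is then marked as a missing character.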

The spelling and word-order markups can easily be separated by changing the values of the input parameters 
misspellOk, anyOrderOk, and extraWordsOk. This permits the two types of errors to be dealt with 
separately by modifying the judging in example 1 in this way: 
global theMarkUpReturnValues

-- Allow word order and extra word errors. Such errors will not be judged or marked
-- during this call to MarkUp, so the judgment reflects spelling only.
put markUp( model, response, , True, True ) into spellMarkUp
put item 1 of theMarkUpReturnValues into spellingOk

-- Allow spelling errors. Such errors will not be judged or marked during
-- this call to MarkUp, so the judgment reflects word order and extra words only.
put markUp( model, response, True, , ) into wordOrderMarkUp
put item 1 of theMarkUpReturnValues into wordOrderOk

-- If there were any word order or extra word errors, mark them up.
-- Otherwise, mark any spelling errors. If no errors of either kind are
-- present, judge OK.
if wordOrderOk = "False" then
  put "First, correct your grammar problems." into card field "feedBack"
  put wordOrderMarkUp into card field "markup"
else if spellingOk = "False" then
  put "No. Let's look at your spelling errors." into card field "feedBack"
  put spellMarkUp into card field "markup"
else
  put "OK" into card field "feedBack"
end if



EXAMPLE 3: JUDGING WITH MULTIPLE RIGHT AND WRONG ANSWERS 



A common instructional situation is for a question to have several alternate correct answers. In addition the 
courseware author may have anticipated several incorrect answers, each of which requires its own specific 
feedback. Of course, the student's typed response may not exactly match any of the anticipated (correct or 
incorrect) answers, due to misspellings and other errors. An adequate response analysis requires matching 
the response against each of the models and determining which one provides the closest match. The 
MarkUp XFCN returns three numbers which measure goodness of match: pMatched (percentage of words
matched), pNonInv (percentage of words in correct order), and aveDist (the average edit distance between
words in matched pairs). Whenever MarkUp is called, these numbers are returned as items 2, 3, and 4 of the
HyperCard global variable theMarkUpReturnValues.

To determine best fit, these numbers must be combined to provide a single goodness-of-fit metric. Of 
course, a response which is judged "OK", and is thus error-free relative to the current settings of the 
extraWordsOk, anyOrderOk, and misspellOk flags, should always fit better than any response which is
judged "NO"; beyond this, however, the relative contribution of these three factors in providing an intuitive
good fit is not clear. Further empirical study is required but has not yet been undertaken. Lacking data, we 
can as a first approximation suppose that all three factors are weighted equally, so that the metric will be 



goodnessOfFit :=

     1.0,                                            if judgedOk is True

     ( pMatched + (1 - aveDist) + pNonInv ) / 3,     if judgedOk is False

                                                                                  (22)



Here, aveDist has been subtracted from 1 so that values will range from 0.0 (no fit) to 1.0 (perfect fit), as
with the other two quantities. The resultant value of goodnessOfFit will vary between 0 and 1.0, although
that is not crucial for the application we are developing here.
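As a worked illustration, the equal-weight combination can be written as a small Python function. This is a sketch, not MarkUp code: the function name is hypothetical, all three inputs are assumed to lie between 0.0 and 1.0, and the average used is the one the findBestAns handler in this example computes.

```python
def goodness_of_fit(judged_ok, p_matched, p_noninv, ave_dist):
    """Combine the three fit measures into one number between 0.0 and 1.0.

    Assumptions: p_matched, p_noninv and ave_dist are fractions in [0, 1];
    ave_dist is a distance, so it is turned into a similarity (1 - ave_dist).
    """
    if judged_ok:
        return 1.0                      # an accepted response always fits best
    return (p_matched + p_noninv + (1.0 - ave_dist)) / 3.0


# All words matched, half of them in order, average edit distance 0.25:
print(goodness_of_fit(False, 1.0, 0.5, 0.25))   # (1.0 + 0.5 + 0.75) / 3
```

Because the weights are equal, a response that matches every word perfectly but in a scrambled order scores the same as one that keeps the order but matches the words only loosely; different weights would change that ranking.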

Suppose now that we have these data for a single question contained in a card field named "item":

Complete this sentence: Ice cream tastes ______ than spinach.

These data are formatted as follows: the first line is the prompt. Specifications for the correct and wrong
answers, and for feedback, follow on the remaining lines. The end of the data is marked by a "#" symbol in
column 1. A correct answer must occupy only one line and is indicated by the word "answer" as the first
word of the line. Similarly, a wrong answer is indicated by the word "wrong" as the first word. All the
lines that come after an answer but before the next answer (here indented for readability) are the feedback
which will be shown if the preceding answer is the best match for the response.



on showPrompt
  -- Copy data into global variable ITEMDATA and display prompt to student.
  global itemData
  put card field "item" into itemData
  put line 1 of itemData into card field "prompt"
end showPrompt
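The item data format described above can be illustrated with a short Python parser. This is a sketch under stated assumptions, not the author's code: the answer, wrong, and feedback lines in SAMPLE are hypothetical stand-ins invented for the illustration, and the function name parse_item is likewise hypothetical.

```python
# Hypothetical item data; only the layout follows the description above.
SAMPLE = "\n".join([
    "Complete this sentence: Ice cream tastes ___ than spinach.",
    "answer better",
    "    Right.",
    "wrong gooder",
    "    The comparative of good is irregular.",
    "#",
])


def parse_item(item_data):
    """Return (prompt, entries); each entry holds the kind ("answer" or
    "wrong"), the model string, and the feedback lines that follow it."""
    lines = item_data.split("\n")
    prompt, entries = lines[0], []
    for line in lines[1:]:
        if line.startswith("#"):          # "#" in column 1 ends the data
            break
        words = line.split()
        if words and words[0] in ("answer", "wrong"):
            entries.append({"kind": words[0],
                            "model": " ".join(words[1:]),
                            "feedback": []})
        elif entries:                     # feedback belongs to the last answer
            entries[-1]["feedback"].append(line.strip())
    return prompt, entries


prompt, entries = parse_item(SAMPLE)
print(prompt)
for entry in entries:
    print(entry["kind"], "|", entry["model"], "|", entry["feedback"])
```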
on returnInField
  -- Compare typed response to the answers in global var ITEMDATA and find best match.
  -- Display feedback and markup which goes with best matched answer.
  -- If no answer is matched, display "NO".
  global itemData

  -- Search itemData for best answer.
  put findBestAns( target, itemData ) into bestAns
  put item 1 of bestAns into bestFit
  if ( bestFit = 0 ) then
    put "NO" into card field "feedBack" -- No answer matched.
  else
    put item 2 of bestAns into bestLine
    put item 3 of bestAns into bestMarkUp
    put word 1 of line bestLine of itemData into polarity -- Answer or wrong
    if ( polarity = "answer" ) then
      put "OK" into card field "feedBack" -- Matched correct answer.
    else
      put "NO" into card field "feedBack" -- Matched wrong answer.
    end if
    repeat with i = bestLine + 1 to number of lines in itemData -- Find feedback.
      if ( word 1 of line i of itemData is in "answer wrong #" ) then exit repeat
    end repeat
    put line ( bestLine + 1 ) to ( i - 1 ) of itemData after card field "feedBack"
  end if
end returnInField
function findBestAns response, itemData
  -- Scan through all correct and incorrect answers in ITEMDATA and find
  -- the one which matches RESPONSE best.
  -- ITEMDATA must contain answer and feedback data, formatted as shown above.
  -- Return is a list of three comma-separated items:
  -- Item 1: Goodness value of best-matched answer (0 if no match).
  -- Item 2: Line number within ITEMDATA of best-matched answer.
  -- Item 3: Markup string which goes with best-matched answer.
  global theMarkUpReturnValues, theMarkUpSymbols, theMarkUpPunctuation, ¬
  theMarkUpParameters, theMarkUpCharInfo, theMarkUpParamDisplay, ¬
  theMarkUpMaps, theMarkUpDebug

  put 0 into bestFit
  put empty into bestLine
  put empty into bestMU
  repeat with i = bestLine + 1 to number of lines in itemData
    put line i of itemData into m
    if ( word 1 of m = "answer" ) or ( word 1 of m = "wrong" ) then
      delete word 1 of m
      put markUp( m, response ) into mu
      put item 1 of theMarkUpReturnValues into match
      if ( match = "True" ) then
        put item 2 of theMarkUpReturnValues into pMatched
        put item 3 of theMarkUpReturnValues into pNonInv
        put item 4 of theMarkUpReturnValues into aveDist
        put ( pMatched + pNonInv + 1 - aveDist ) / 3 into ansFit
        if ( ansFit > bestFit ) then
          put ansFit into bestFit
          put i into bestLine
          put mu into bestMU
        end if
      end if
    end if
  end repeat
  return ( bestFit & "," & bestLine & "," & bestMU )
end findBestAns



The findBestAns() function simply searches through the answer data looking for each "answer" or
"wrong" specification. Whenever one is found, it is matched against the response. If there is a match and 
that match improves on the best of the previous matches, the current match is made the best one. 
Eventually all the answers are examined and the information about the best matched one is returned. 



EXAMPLE 4: USING THE MARKUP MAPS TO CONTROL VOCABULARY HELP 



When writing foreign language courseware, a simple graphic markup is often not specific enough as error 
feedback. If, for example, the student is ignorant of certain vocabulary words required by the response, a 
missing or unidentified word markup may not provide sufficient help. The handler below uses the markup 
maps to see which words in the model are not matched by words in the student's response, and gives
vocabulary help on just those items. As before, we will suppose three card fields named "prompt",
"response", and "markUp", and in addition one called "vocHelp", where vocabulary help will be displayed.



on presentItem
  global correctAns, vocList
  put "Translate to French: She read the last ten pages for us." ¬
  into card field "prompt"
  put "Elle nous a lu les dix dernières pages." into correctAns
  -- The items in this list correspond to words in the correct answer.
  put "Elle, nous, avoir, lire (irreg), le/la, dix, dernier, page (f)" into vocList
  setUpMarkUp "response"
end presentItem



on judgeResponse
  global theMarkUpReturnValues, theMarkUpSymbols, theMarkUpPunctuation, ¬
  theMarkUpParameters, theMarkUpCharInfo, theMarkUpParamDisplay, ¬
  theMarkUpMaps, theMarkUpDebug, correctAns, vocList

  -- Request return of markup maps by setting markUpMapsNeeded to True.
  put markUp( correctAns, target ,,,,,,,,, True ) into markUpString

  -- Use returned values to generate usual markup display.
  if ( item 1 of theMarkUpReturnValues = "True" ) then
    -- If response was judged correct, no markup required.
    put "OK" into card field "markup"
  else
    -- If response had errors, display markup string.
    put markUpString into card field "markup"
  end if

  -- Use markup map to generate additional vocabulary help.
  -- vocList is comma-separated, so its entries are addressed as items.
  put empty into card field "vocHelp"
  put line 2 of theMarkUpMaps into MtoRMap -- Model-to-response map
  repeat with i = 1 to number of items in MtoRMap
    if ( item i of MtoRMap = "0" ) then
      put ( item i of vocList & return ) after card field "vocHelp"
    end if
  end repeat
end judgeResponse



on returnInField
  -- This handler must be in the script of card field "response".
  judgeResponse
end returnInField
EXAMPLE 5: USING MARKUP MAPS TO JUDGE LISTS



Many questions solicit answers in the form of a list, for example, "Name the five Great Lakes", or perhaps
"Name at least three of the five Great Lakes". In such cases, the order in which items are listed is not
relevant, only the fact that they are present somewhere in the response. The first case, "Name the five Great
Lakes", can be easily provided for by setting anyOrderOk (the 5th parameter slot) to True when the
MarkUp XFCN is called:

get markUp( "Michigan Superior Huron Erie Ontario", target, , , True )

Missing words, extra words, and misspellings will still be marked appropriately, but any order at all will be 
accepted. Punctuation, as usual, will be ignored when doing the judging, so the student may use spaces or 
any other kind of punctuation to separate words. Note, however, that anyOrderOk operates on individual
words, not phrases, so that a question like "Name the Dakotas" cannot be reliably judged using 

get markup ("North Dakota South Dakota", target, , , True) 

because answers like "South Dakota Dakota North" will be judged as correct. There is no way to define or 
manipulate phrases in version 2.0 of the MarkUp XFCN. 
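The word-level behavior can be seen in a small Python sketch. It is not MarkUp's algorithm (spelling tolerance, synonyms, and punctuation handling are all ignored), and both function names are hypothetical. The second function shows one possible workaround at the courseware level, under the assumption that students are told to separate phrases with commas.

```python
from collections import Counter


def same_words_any_order(model, response):
    """True when both strings contain the same words, in any order."""
    return Counter(model.lower().split()) == Counter(response.lower().split())


def same_phrases_any_order(model, response, sep=","):
    """True when both strings contain the same comma-separated phrases,
    in any order; word order inside each phrase must be preserved."""
    def phrases(text):
        return Counter(" ".join(p.lower().split()) for p in text.split(sep))
    return phrases(model) == phrases(response)


# Word-level comparison accepts a scrambled answer:
print(same_words_any_order("North Dakota South Dakota",
                           "South Dakota Dakota North"))
# Phrase-level comparison rejects it but still ignores phrase order:
print(same_phrases_any_order("North Dakota, South Dakota",
                             "South Dakota, Dakota North"))
print(same_phrases_any_order("North Dakota, South Dakota",
                             "south dakota, north dakota"))
```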

The response to a question like "Name at least three of the 5 Great Lakes" can be handled efficiently with 
the help of the markup maps, using handlers like these: 



function countInstances list, object
  -- LIST must be a list of comma-separated items.
  -- OBJECT is a value to match to each item (case and accent ignored).
  -- Return is number of times OBJECT occurred as an item in LIST.
  put 0 into count
  repeat with i = 1 to number of items in list
    if ( item i of list = object ) then add 1 to count
  end repeat
  return count
end countInstances
on judgeResponse
  global theMarkUpReturnValues, theMarkUpSymbols, theMarkUpPunctuation, ¬
  theMarkUpParameters, theMarkUpCharInfo, theMarkUpParamDisplay, ¬
  theMarkUpMaps, theMarkUpDebug

  put "Michigan Superior Huron Erie Ontario" into correctAns

  -- Set anyOrderOk (slot 5) and markUpMapsNeeded (slot 11) to True.
  put markUp( correctAns, target,,, True,,,,,, True ) into markUpString
  put line 1 of theMarkUpMaps into RtoMMap -- Response-to-model map.
  put line 2 of theMarkUpMaps into MtoRMap -- Model-to-response map.
  put countInstances( MtoRMap, "1" ) into numCorrect
  put countInstances( RtoMMap, "0" ) into numIncorrect
  if ( numCorrect >= 3 ) and ( numIncorrect = 0 ) then
    put "OK" into card field "feedBack"
  else
    put "NO" into card field "feedBack"
    put markUpString into card field "markup"
  end if
end judgeResponse



The function countInstances() is used here to count both the number of lakes which are matched and
the number of response words which are unmatched and thus do not correspond to any lake. If three or more 
lakes were matched, and there were no incorrect lake names, then the student has successfully answered the 
question. Otherwise, the markup is shown; this will mark any incorrect lake names as extra words. 



EXAMPLE 6: A COMPLETE VIEW OF JUDGING PARAMETERS 



When using MarkUp to develop new courseware, it is sometimes convenient to collect and view 
information on all of the judging parameters. This can be easily done with the following HyperCard 
function: 



function fullParamInfo
  global theMarkUpReturnValues, theMarkUpSymbols, theMarkUpPunctuation, ¬
  theMarkUpParameters, theMarkUpJudgingTables, theMarkUpParamDisplay, ¬
  theMarkUpMaps

  put "nbdchpvfm" into typeList
  put empty into r
  repeat with i = 1 to length( typeList )
    get markUp(, ,,,,,,,,,, ( char i of typeList ) ) -- Puts info into global.
    put ( theMarkUpParamDisplay & return & return ) after r
  end repeat
  return r
end fullParamInfo
Notice that the MarkUp XFCN is called with null model and response strings; since the object here is not
to get a markup, but to retrieve information about the current markup parameters, entering strings to be
judged is unnecessary. Only the 12th parameter, which specifies the type of information to be returned, is
systematically varied by having the loop step through the list "nbdchpvfm". Each call returns a result
which is appended to the temporary variable r. Calling this function and putting the return value into a
field, e.g.,

put fullParamInfo() into card field "parameterDisplay"

will yield a formatted display like this one:



MUParamDisplay n 

Markup XFCN 18 Aug 93, 12:47 PM - R. Hart UI/UC Language Learning Laboratory
MUParamDisplay b 

A=a, A=a, C=c, £=e, N-n, 0=o, 0=u,a=a,a=a,a=a, a=-a, a~a, a=a, c^c, e=e, e=e, e=e, e=e , i = i , i = i , I = i , 1- 
i, fi-n, 6-o, 6=o, 6=o, 8=o, O»o, u=u, u=u, Q=u, U=u, A=a, A«a, 0=o, y=y, ?=y, A=a, £=e, A=a, E=e, E=e, t = i , 
I-i, I = i, l>i, 6=o, 6=o, 6-o, 0=u, 0=u, 0=u 

MUParamDisplay d 

A=4,g>7, £=1, 8-8,0=4, 0-4, a-1, a=2, a=3, a = 4, a=8, c»7 , e™l , e=2 , e=3, e=4 , i = l , i=2 , 1 = 3 , 1 = 4 , fi = 8 , 6= 
1,6=2,6=3,3-4, o=8, u= 1 , u=2 , G=3 , U=4 , A=2 , A=8,0=8, y=4 , ¥=4 , A=3 , £ = 3 , A = l , E=4 , £=2 , 1 = 1 , 1 =3 , 1 = 4 , 
1-2,6-1,0-3,6-2, 0-1, 0=3, 0=2 

MUParamDisplay c 

A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z 
MUParamDisplay h 

= 1, =1, =1, =1, =1, W, = 3, =1, =1, =1, -1,«1, =1, =1, =1, =1, =1, =1, =1, =1, =1, 

= 1, =1, =1, =1, =1, =1, =1, =1, =i,=i, = i, !--»!,"= i, 1=:, $=:,%=:, s = :,• = !, <=: ,)-•!, - -i , 

+ »1, , -1, =1, . =1 , /-l, 0 = 1 , 1 = 1 ,2 = 1, 3 = 1 , 4 = 1 ,5 = 1, 6=1, 7 = 1, 8=1, 9=1, : = 1, ; = 1 , <- 1, ■-■ 1 ,> '•'. , ? 
, A-1, B=1,C»1,D=1 , E = l, F-l ,G=1, H-l, 1 = 1 , J = l, K=l, L=1,M=1, N=l, 0=1, P-l, Q=l, R= 1 , S» 1 , T= I, U = l , 
V=l, W=l, X=l, Y=l, 2=1, ( = 1, \=1, ] = 1, ~=»1, _=1, ■ =1, a=0, b = l,c = l,d=l,e=0, f = l,g = l,IW,i = 0,j = l, 
k-1, l=l,m-l, n=l, o=0, p= 1 , q=l, r= 1 , s= 1 , t-1 , u=0, v=l , w=l , x=l , y=0 , z = l , ( =1 , I =1 , i =1 , ~ = 1 , -1 , 
A=l, A=1,C=1, £=1, N= 1,6=1, 0=1, a=l, a=l, 3=1, a = l, a=l, a = l, c=l , e=l , e=l , e=l , e=l , 1=1, i = l , 1 = 1, 
1 = 1, A=l, 6=1, d=l, 6=1, o=l, 0=1, u=l, ii = l, 0 = 1, u = l, t-l,° = l,c = i, f>l,§=l, . = 1, <fl = l, 11=1, ®=1, ©=1 , 
™=1, ' = 1, "=l,»»-l,i»l,0=l,«-l,±=l,S=l,2=l,¥=l,Ji=l,3»l,X-l, 11= l,Tt=l,J=l, « = 1, ° = 1, 1 , ae= 1 , 

0*1, £=i, i = i, -i-i, V-i, /»;, — ;,a=;,«=i, »i, =i, A»i, a=i,6=i,ce=i,ce=i, - = i,-=i, "=i, "=i, 

' = 1, ' = l,+=l,0=l,y-l, V=l, --=1, n-l,< =1, >=l,fi=l,a=l, t-1, - = 1, , =1, „ = l,fc-l, A = l ,E = 1, A = l , E = l, 

£=1, 1 = 1, 1=1, 1 = 1, 1=1,6=1,6=1, #-i,6=-l, 0=1,0=1,0=1,1=1, ' = 1, - = 1, " = 1, - = 1, ' =1 , *=l, .=1, " = l, 
. = 1, ' = 1 

MUParamDisplay p 
!(),.:;<>?[) 

MUParamDisplay v 

20, 20, 30, 20, 1,1,0. 35,0. 67, 0.2 

MUParamDisplay f 

false, false, false, true, true, true, true, false, f 

MUParamDisplay m 
XA«x\->< [ 

MUParamDisplay t 

30, 36, 36, 36, 36, 3 6, 30, 3 6, 36, 36, 36, 3 6, 30, 36, 3 6, 36, 3 6, 36, 30, 36, 3 6, 36,3 6, 36, 30 



Even though the actual parameter information may wrap around several screen lines, it occupies only one 
HyperCard line. The single exception to this is MUParamDisplay p, the list of punctuation symbols, 
which will occupy two lines whenever the RETURN character is on the list of punctuation symbols. 
Usually, one collects parameter information to inspect it visually and verify that the parameter values are as
intended. Sometimes, however, the program itself may need to access the parameter values. If so, they can 
be readily extracted. The fact that each type of information occupies one line simplifies the process of 
parsing out information from the display. The following HyperCard function, which returns the base 
character corresponding to a specified input character, will work on either the simple base character or an 
aggregate of data like that shown just above. 




function findCharBase ch, paramData
  -- PARAMDATA must contain formatted parameter display information.
  -- Return is the base character corresponding to character CH.
  -- If base info is not in the PARAMDATA, return EMPTY.
  repeat with i = 1 to number of lines in paramData
    if ( line i of paramData = "MUParamDisplay b" ) then
      put ( line i + 1 of paramData ) into bList -- Base char info
      put offset( ch, bList ) into p
      if p > 0 -- CH is in the list.
      then return char ( p + 2 ) of bList -- Every entry has 3 chars.
      else return ch -- CH is not in list, so is its own base char.
    end if
  end repeat
  return empty -- Info on base chars not in data.
end findCharBase



When card field "parameterDisplay" contains the data shown above, calling this function with an accented
input character such as "é"

get findCharBase( "é", card field "parameterDisplay" )

will return "e", which is the base character corresponding to "é".
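The lookup can also be sketched in Python by parsing the base-character line into a table first. This is an illustration, not MarkUp code: the function names are hypothetical, and the sample line is a short stand-in for a real "MUParamDisplay b" line (written with Unicode escapes so that each accented letter is a single character). Keying the table on the left-hand character of each entry means a plain "e" is never confused with the "e" on the right of an "=" sign.

```python
def parse_base_list(b_line):
    """Turn a line such as "X=x, Y=y" into a {character: base} table.

    Assumption: entries are separated by commas and each entry has exactly
    three characters, the middle one being "=".
    """
    table = {}
    for entry in b_line.split(","):
        entry = entry.strip()
        if len(entry) == 3 and entry[1] == "=":
            table[entry[0]] = entry[2]
    return table


def find_char_base(ch, b_line):
    """Return the base character of ch; a character that is not listed
    is its own base character."""
    return parse_base_list(b_line).get(ch, ch)


# Hypothetical sample: E-acute, e-acute, and n-tilde with their bases.
sample = "\u00c9=e, \u00e9=e, \u00f1=n"
print(find_char_base("\u00e9", sample))
print(find_char_base("e", sample))
```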

Accent information can be parsed out in the same way. Information on case, punctuation, and phonetics of 
any given character can be retrieved similarly. The following function is a predicate which will return 
True if CH is upper-case and False if it is lower-case: 
function charIsUpperCase ch, paramData
  -- PARAMDATA must contain formatted parameter display information.
  -- Return is TRUE if CH is upper-case; FALSE if it is lower-case.
  -- If case info is not in PARAMDATA, return EMPTY.
  repeat with i = 1 to number of lines in paramData
    if ( line i of paramData = "MUParamDisplay c" ) then
      put ( line i + 1 of paramData ) into cList -- Case info
      return ( offset( ch, cList ) > 0 ) -- Check if CH is on CLIST.
    end if
  end repeat
  return empty -- Info on case not in data.
end charIsUpperCase



Assuming the data shown above, this call

get charIsUpperCase( "A", card field "parameterDisplay" )

will return True.



EXAMPLE 7: USING THE EXACT SPELLING MARKUP 



Spelling is a skill which associates with the auditory image of each word a suitable visual image. The
irregularities of English spelling do not in general permit the visual (graphic) associate to be fully predicted
from the auditory form, so a student must often visually encode the conventional graphical icon which
corresponds to each word, paying special attention to those areas of the visual image which are not
predictable from phoneme-to-grapheme rules (Simon & Simon, 1973; Simon, 1975). To facilitate visual
coding, it is important that the student not see incorrect spellings, since these may be encoded into long-term
memory and interfere with the correct coding. It is also useful to focus the student's visual attention on
those areas of the word which have not yet been encoded correctly (i.e., which were misspelled). The
correctSpelling function below returns a display designed to satisfy these two specifications. Incorrect and
omitted letters appear as capitals, so that the student can focus attention on those areas of the word; extra
letters in the student's response are replaced by "•", which is relatively inconspicuous but warns the
student that her visual image was not correct in this region. (This English example neglects the possibility
of capitalization or accent errors.) For example, here are the displays returned by various misspellings of
the word "necessary":

Response: nesessarey

Display: neCessar•y



Response: neccisary 
Display: necEsSary 

Since the mixture of upper- and lower-case letters is a very unusual image of word spelling, such a display
string should probably be further processed in HyperCard so that the upper-case letters are changed to
lower-case boldface, sized, and perhaps color coded and displayed via 32-bit QuickDraw. The missing letter
symbol can be further minimized by highlighting the letters on each side of the omission, then deleting the
omitted character to give displays like these:



neCessaRY
neCesSary 

Such displays could be useful as feedback in a program that taught spelling by dictating individual words to
students (preferably in the context of a full sentence). 



function correctSpelling model, response
  global theMarkUpReturnValues

  -- Request the raw edit trace by setting rawTrace (slot 13) to "r".
  put markUp( model, response,,,,,,,,,,, "r" ) into markUpString
  put item 1 of theMarkUpReturnValues into judgment

  if judgment = "False" then
    put 1 into m
    put 1 into r
    put empty into w
    repeat with i = 1 to length( markUpString )
      put char i of markUpString into c
      if c = "-" then -- Chars match.
        put char r of response after w
        add 1 to m
        add 1 to r
      else if c = "\" then -- Missing char.
        put upCase( char m of model ) after w
        add 1 to m
      else if c = "x" then -- Extra char.
        put "•" after w
        add 1 to r
      else if c = "=" then -- Wrong char.
        put upCase( char m of model ) after w
        add 1 to m
        add 1 to r
      else if c = ">" then -- Transposition.
        put upCase( char ( m + 1 ) of model ) after w
        put upCase( char m of model ) after w
        add 2 to m
        add 2 to r
      else if c = "<" then -- Already handled at ">".
      end if
    end repeat
  end if
  return w
end correctSpelling



function upCase c
  -- Returns the upper case version of an alphabet letter C.
  -- If C is not a lower-case alphabet letter, return C unchanged.
  if ( "a" <= c ) and ( c <= "z" )
  then return numToChar( charToNum( c ) - charToNum( "a" ) + charToNum( "A" ) )
  else return c
end upCase



Since individual words are involved, the MarkUp XFCN is called with the rawTrace parameter (slot 13) set
to "r". This forces the response to be analyzed as a single string, even if there are leading, trailing, or
internal spaces (leading and trailing spaces should be removed by your HyperCard script before the response
is submitted to MarkUp). The raw edit trace is needed here to force a markup to be computed even when the
misspelling is very bad, and because complete information about the misspelling is required to generate an
accurate spelling correction. The markup characters are then processed one at a time. Wherever the
response contains a missing or incorrect letter, the upper-case equivalent in the model is substituted, and
wherever there is an extra letter in the response, it is replaced by a "•" character. When a character in the
model matches the response, it is shown in lower-case.
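The character-by-character processing described above can be traced in a short Python sketch that mirrors the printed handler. It is an illustration under assumptions, not MarkUp code: the trace string in the usage example was written by hand for this illustration rather than produced by the XFCN, and the trace alphabet ("-" match, "\" missing, "x" extra, "=" wrong, ">" and "<" transposition) is taken from the handler as printed.

```python
def correct_spelling(model, response, trace):
    """Build the feedback display from an edit trace, one symbol at a time."""
    m = r = 0                                   # positions in model and response
    out = []
    for c in trace:
        if c == "-":                            # characters match
            out.append(response[r])
            m += 1
            r += 1
        elif c == "\\":                         # letter missing from response
            out.append(model[m].upper())
            m += 1
        elif c == "x":                          # extra letter in response
            out.append("\u2022")                # bullet replaces the extra letter
            r += 1
        elif c == "=":                          # wrong letter
            out.append(model[m].upper())
            m += 1
            r += 1
        elif c == ">":                          # transposition, as in the handler
            out.append(model[m + 1].upper())
            out.append(model[m].upper())
            m += 2
            r += 2
        # "<" closes a transposition that was already handled at ">"
    return "".join(out)


# Hand-written trace for the misspelling "nesessarey" of "necessary":
print(correct_spelling("necessary", "nesessarey", "--=-----x-"))
```

Keeping separate positions for model and response is what lets missing letters advance only the model and extra letters advance only the response.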



TECHNICAL DETAILS AND LIMITATIONS 



The most current version of MarkUp (the one documented by this report) is Markup XFCN 3.0 - 19 Dec
94, 8:20 PM. To determine the version you have, execute MarkUp() without parameters; a version string
will be returned. The revision history of MarkUp can be found near the beginning of the PASCAL source
file. The MarkUp XFCN is compiled as a THINK PASCAL project containing the following files, in the
indicated order:

DRVRRuntime.lib

Interface.lib

HyperXCmd.p

HyperXLib.lib

MarkUpXFCN3.p

It has been tested on a Macintosh Quadra 700, running under the Macintosh Finder 7.1 and 7.5 and 
HyperCard 2.1 and 2.2 launched with 2 megabytes of memory. It should, however, run under virtually any 
Macintosh configuration. The user should be aware of the following limitations on the MARKUP XFCN: 

Versions through 3.0 will not work properly with 16-bit character representations, i.e., with the
Macintosh language extensions.
Maximum number of letters in a single word: 22 

Maximum number of words in model (including synonyms but excluding ignorables): 18 
Maximum number of words in response: 18 
Maximum number of characters in model: 255 
Maximum number of characters in response: 255 
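Courseware can check these limits before a string is submitted for judging. The Python sketch below only illustrates the idea and is not part of MarkUp: the function name is hypothetical, words are counted by splitting on white space (MarkUp divides words using its punctuation list), and the synonym and ignorable-word rules for the model are not modeled.

```python
MAX_WORD_LETTERS = 22   # maximum number of letters in a single word
MAX_WORDS = 18          # maximum number of words in model or response
MAX_CHARS = 255         # maximum number of characters in model or response


def check_limits(text):
    """Return a list of limit violations for one model or response string."""
    problems = []
    if len(text) > MAX_CHARS:
        problems.append("more than %d characters" % MAX_CHARS)
    words = text.split()
    if len(words) > MAX_WORDS:
        problems.append("more than %d words" % MAX_WORDS)
    if any(len(w) > MAX_WORD_LETTERS for w in words):
        problems.append("a word longer than %d letters" % MAX_WORD_LETTERS)
    return problems


print(check_limits("un deux trois quatre cinq"))
print(check_limits("a " * 19))
```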

These limitations are not intrinsic to the MarkUp algorithm, but are imposed by the fact that the MarkUp
code has to run in the limited space provided by the HyperCard stack. The compiled MarkUp XFCN project
occupies a bit more than 42700 bytes of space; the MarkUp XFCN itself occupies about 22474 bytes.
This is near the limit of the allowed size for HyperCard code resources. XFCNs borrow their space from
the HyperCard stack, so if MarkUp is run in recursive or other deeply embedded contexts, there may not be
sufficient stack space. Running the MarkUp XFCN in such a situation will cause the stack to overflow
into the heap and will most likely cause a hard system crash of type 28 (stack has moved into application
heap), if not immediately, then soon thereafter, or at the latest during exit from HyperCard. To guard against
this, the markUpUsingParms() function checks to make sure that there are at least 28500 bytes free on
the HyperCard stack; if not, MarkUp is not run and an error dialog appears. (Since MarkUp's
word-order-error algorithm is recursive, the space required is somewhat sensitive to the number of words in
model and response, but 28500 bytes should be sufficient to run with the maximum of 18 words.) If you call
the primitive MarkUp XFCN, you should first use the HyperTalk function the stackSpace to ensure that this
much stack space is available.

To conserve stack space, some large MarkUp array structures have been put into dynamic memory. The 
Mac Toolbox routines NewPtr() and DisposPtr() are used to allocate and deallocate this memory, 
which amounts to about 24K of space. If this much heap memory is not available, MarkUp aborts and 
returns the error message "%Couldn't get matrix memory." 

No formal speed testing was done since, even for maximum-length sentences, MarkUp returns without 
discernible delay. 



AVAILABILITY 



The MarkUp XFCN is freeware. It can be ordered on diskette for a handling fee or accessed via FTP. 
Contact the author for further information. Copyright of MarkUp resides with the author, but you may use 
it as a component of commercial or non-commercial software. If you do so, acknowledgment of the author 
and the Language Learning Laboratory of the University of Illinois at Urbana-Champaign would be 
appreciated. As freeware, MarkUp is offered as is, without any warranty of any kind. However, if you have 
questions or encounter problems in using the package, please contact 

Robert S. Hart, Associate Director 
Language Learning Laboratory 
University of Illinois at Urbana-Champaign 
G-70 Foreign Languages Building 
707 S. Mathews Ave. 
Urbana, IL 61801 

voice (217) 333-9776 
fax (217) 244-0190 
email hart@uxl.cso.uiuc.edu 
{ Spelling and word order markup utility, Version 3.0 }

{ Implemented as HyperCard XFCN in Macintosh THINK PASCAL, Version 4.0.2 }

{ Robert S. Hart   UI/UC Language Learning Laboratory   13 April 1994 }

{ Copyright 1993-4 by Robert S. Hart. }

{ Spelling markup done by dynamic programming algorithm which generates a markup }
{ corresponding to a least-cost editing trace. Editing operations are restricted }
{ to omission of a letter, insertion of an extra letter, substitution of one }
{ letter for another, or transposition of two adjacent letters. }

{ Capitalization and accent errors are also identified and marked. The user may }
{ specify the way in which capitalization disagreements will be treated: exact }
{ agreement required, capital required if the model has one, or capitalization }
{ differences ignored. }

{ Run-together words are identified as such if they are adjacent in the model. }
{ Some misspelling is tolerated in one or both run-together words. }

{ Order analysis identifies extra words, missing words, and misplaced words. }
{ The user can specify various degrees of tolerance when defining what constitutes }
{ a match of the model: spelling errors can be excused, incorrect word order can }
{ be excused, and extra words in the response can be excused. }

{ The order analysis returns three goodness of fit measures: proportion of }
{ matched words, proportion of words in correct order, and average amount of }
{ misspelling per matched word. }

{ When specifying a correct answer, the author is allowed to specify one or }
{ more words which will be ignored if they occur in the student's response. }
{ Such a word or list must be surrounded by angle brackets, < >. A list of }
{ "synonyms" (i.e., a set of words any one of which would be correct at a }
{ given position in a sentence) must be surrounded by square brackets, [ ]. }
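
The least-cost edit computation described in the header above (omission, insertion, substitution, and transposition of adjacent letters) can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function name `edit_cost` is hypothetical, the weights 20, 20, 30, 20 are the defaults set later in this listing, and the sketch returns only the total cost. It does not build the markup trace, and it ignores the phonetic substitution matrix, capitalization and accent handling, and normalization.

```python
def edit_cost(model, response, w_insert=20, w_delete=20, w_change=30, w_transpose=20):
    """Least cost of editing `response` into `model` (illustrative sketch).

    d[i][j] is the least cost of aligning model[:i] with response[:j].
    """
    m, r = len(model), len(response)
    d = [[0] * (r + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * w_delete          # letters omitted from the response
    for j in range(1, r + 1):
        d[0][j] = j * w_insert          # extra letters in the response
    for i in range(1, m + 1):
        for j in range(1, r + 1):
            same = model[i - 1] == response[j - 1]
            best = min(
                d[i - 1][j] + w_delete,                      # omitted letter
                d[i][j - 1] + w_insert,                      # extra letter
                d[i - 1][j - 1] + (0 if same else w_change)  # match or substitution
            )
            # Transposition of two adjacent letters.
            if (i > 1 and j > 1
                    and model[i - 1] == response[j - 2]
                    and model[i - 2] == response[j - 1]):
                best = min(best, d[i - 2][j - 2] + w_transpose)
            d[i][j] = best
    return d[m][r]
```

With these weights a substitution (30) is cheaper than an omission followed by an insertion (40), and a transposition (20) is cheaper than either, which matches the ordering the listing's own comments describe for the default weights.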






{ LIMITATIONS: }

{ Version 3.0 will not work properly with the Macintosh 16-bit char representation, }
{ i.e., with the language extensions. }
{ Maximum number of letters in a single word: 22 }
{ Maximum number of words in model (including synonyms but excluding ignorables): 18 }
{ Maximum number of words in response: 18 }
{ Maximum number of characters in model: 255 }
{ Maximum number of characters in response: 255 }



{ XFCN INPUT PARAMETERS: }
{ Parameter number and description are shown in the left-hand column. }
{ Permissible values are indicated in the right-hand column. A value preceded }
{ by an asterisk is the default value assigned if the parameter is left empty. }

{  1 - model                     <string of 255 chars, max> }
{  2 - response                  <string of 255 chars, max> }
{  3 - cap_flag                  *"exact_case" | "authors_caps" | "ignore_case" }
{  4 - extraWordsOK              True | *False }
{  5 - anyOrderOK                True | *False }
{  6 - misspellOK                True | *False }
{  7 - wordMarkUpNeeded          *True | False }
{  8 - runTogether_needed        *True | False }
{  9 - adjust_needed             *True | False }
{ 10 - shortcut                  *True | False }
{ 11 - markupMapsNeeded          True | *False }
{ 12 - parameterDisplayNeeded    *<empty> | "x" | "n" }
{ 13 - rawTraceNeeded            *<empty> | "x" | "r" | "p" }
{ 14 - debugNeeded               True | *False }



{ These HyperCard global vars may be used to input additional information: }
{     theMarkUpPunctuation }
{     theMarkUpSymbols }
{     theMarkUpWeights }
{     theMarkUpPhonMatrix }
{     theMarkUpCharInfo }

{ XFCN RETURN VALUES: }
{ (Direct fcn return value): markup string }
{ Returns in HyperCard globals: }
{     theMarkUpReturnValues: OK/NO boolean, plus other judging flags }
{     theMarkupMaps: markup maps for R-TO-M on line 1, and M-TO-R on line 2 (if requested) }
{     theMarkupParamDisplay: display of the requested judging table values and parameters (if requested) }
{     theMarkupDebug: display of requested debugging information }

{ THINK PASCAL Version 4.0.2 project requires the following files, in this order: }
{     DRVRRuntime.lib }
{     Interface.lib }
{     HyperXCmd.p }
{     HyperXLib.lib }
{     MarkUpXFCN3.p (this file) }



{ REVISION HISTORY FOR 3.0: }

{ 5 April 94 - RSH }
{ Fixed problem in WORD_MARKUP which kept spelling marks from being displayed when NOORDEROK was in effect. }

{ 11 April 94 - RSH }
{ Rewrote MARK_SENTENCE, which would destroy left substring of markup when a word with final missing letter(s) }
{ was followed by a missing word caret. Also rewrote DUP_CHAR for greater efficiency. }

{ 26 May 94 - RSH }
{ Fixed DUP_CHAR so that it returns null string when char count is <= 0. }
{ Fixed markup display so that a blank moveword won't shadow spelling markup on first char of word. }
{ Edited SET_DIACRIT_INFO initialization so that Swedish å, Å are assigned "supero" diacrit. }

{ 27 May 94 - RSH }
{ Decoupled internal markup symbols from user-specified symbols to prevent confusion when user specifies }
{ weird or ambiguous symbols. Internal symbol names begin with S (e.g. "Sextraletter"). Also created an }
{ array SYMBOLMAP to map internal chars to user chars. Transformation to external symbols done in CAPMARK }
{ and SPELLMARKS, and SENTENCEMARKUP. Also changed internal display logic so that any user symbol set }
{ equal to "nomark" (a blank space) will be "transparent" -- any symbols it normally shadows will appear properly. }

{ 16 Sept 94 - RSH }
{ Rewrote code which computes NED (normalized edit distance): }
{ Supposedly 0 <= NED <= 1, but in fact NED > 1 sometimes because normalizing term for NED was based on the "average" }
{ cost of an edit operation. Now uses maxEditWeight, computed from wchange and the phon_matrix values at time phon_matrix is }
{ initialized. Normalization done on basis of max possible cost to convert a string of response's length into one of model's }
{ length. }
{ Now 0 <= NED <= 1 is guaranteed. }
{ Rewrote code which converts NED to SNED (integer scaled normalized edit distance): }
{ Small NEDs became 0 when scaled to SNED integers for the sim[] array, because scale factor was too small, so significant }
{ digits remained fractional and were lost in truncation. Made SIM[] and other variables that hold scaled NEDs into LONGINT }
{ and increased scale factor to 10000 so that even very small fractions have an integer representation. Both infinity and }
{ editWeightScale are now constants and are used everywhere. MaxCost has been eliminated. }
{ Fixed logic error in SPELLMARKS which failed to reset preemption flag and thus dropped all marks after the first }
{ preemptive mark. }
{ Edited Phon Category assignments so that upper case vowels are counted as vowels as well as lower case ones. }

{ 1 November 94 - RSH }
{ Moved large judging matrix MARKS to dynamic memory so that code would not take so much room on HyperCard }
{ stack. Replaced by new ptr MARKSP of type LSMATRIXP which is used to point to a new handle. Handle is disposed }
{ before exit. }

{ 30 November 94 - RSH }
{ Fixed error which caused the runtogether word analysis to overwrite previous matchings, imposing a new bogus }
{ match on words which already had imperfect single-word match. Introduced new sets MMAYBE and RMAYBE to keep }
{ track of M and R positions which have any kind of match, and used it to make sure that already matched M words }
{ are not grabbed by the runtogether analysis. Now only M words which have no potential match at all can become }
{ candidates as runtogether word match. Test case is }
{     Model: Yesterd he }
{     Response: He yesterd }
{     Markup: S [\ }
{ Where "yesterd" has been interpreted as a runtogether of "Yesterd he" }
{ "he" is set as ignorable, which leads to a PMATCHED of 4/3 }
{ Also rationalized computation of PMATCHED, which is now (matched words / total # words in M and in R), i.e., }
{     2*MATCHEDK / [ (CARD(M) - CARD(MIGNORE)) + (CARD(R) - CARD(RIGNORE)) ] }

{ 20 December 94 - RSH }
{ Edited set_cap_info and set_phon_info in create_char_info_tables so that Courier upper-case accented vowels would be correctly }
{ identified as upper-case and as vowels. }
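
The PMATCHED formula given in the 30 November 94 entry can be illustrated with a short Python sketch. The function name `pmatched` and its parameter names are hypothetical stand-ins for MATCHEDK and the four set cardinalities, and returning 0.0 when the denominator is zero is an assumption of this sketch rather than documented MarkUp behavior.

```python
def pmatched(matched, m_words, m_ignored, r_words, r_ignored):
    """Proportion of matched words, following the revision-history formula:

        2 * matched / ((|M| - |Mignore|) + (|R| - |Rignore|))

    Assumption: returns 0.0 when no countable words remain on either side.
    """
    denominator = (m_words - m_ignored) + (r_words - r_ignored)
    if denominator == 0:
        return 0.0
    return 2 * matched / denominator
```

Because each match is counted once for the model and once for the response, the value stays at or below 1 as long as the number of matches does not exceed the number of non-ignorable words on either side.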



unit markUpXFCN;
interface

uses
  HyperXCmd;

procedure main (paramPtr: XCmdPtr);

{ FORWARD }

implementation

procedure main (paramPtr: XCmdPtr);



vnax 

9999; 



const
  versionStr = 'Markup XFCN 3.0 - 19 Dec 94, 8:20PM - © Robert S. Hart - UI/UC Language Learning Laboratory';
  lmax = 22;    { max number of letters in a word }
  wmax = 18;    { max # positions in model and words in resp after processing }
  { WARNING: word_max must be larger than or equal to wmax! }
  word_max =    { max # of words in model at input }
  infinity =
  space = ' ';
  spaces = '
  editWeightScale = 10000;

  { Internal markup symbols }
  Saddcap = '+';           { plus sign }
  Sdropcap = '↓';          { downarrow }
  Saccenterr = '~';        { tilde }
  Sextrawd = 'X';          { capital X }
  Smissingwd = 'Δ';        { capital delta }
  Smovewd = '◄';           { solid leftward arrowhead }
  Sextraltr = 'x';         { small x }
  Smissingltr = '\';       { backslash }
  Ssubstituteltr = '=';    { equal sign }
  Stransltr1 = '>';        { right angle bracket }
  Stransltr2 = '<';        { left angle bracket }
  Srunonwd = '[';          { left square bracket }
  Snomark = '_';           { underscore }



letterErrors 
letter > 



[Snomark, Saccanterr, 



'D'J; I Raw markup chars used to Indicate case/accent errors or no error on 



type
  inputwrange = 1..word_max;
  wrange = 1..wmax;
  lrange = -1..lmax;
  wordstr = string[lmax];
  str80 = string[80];

  wivector = array[wrange] of INTEGER;
  inputwivector = array[inputwrange] of INTEGER;
  wsvector = array[wrange] of wordstr;
  inputwsvector = array[inputwrange] of wordstr;
  diacrit_variants = (no_accent, acute, grave, circumflex, diarsis, umlaut, supero, cedilla, tilde, subdot, superdot, subhat,
                      superhat, subhook, macron);
  phon_variants = (vowel, consonant, phon3, phon4, phon5);
  cap_flag_type = (exact_case, authors_caps, ignore_case);
  case_variants = (up_case, down_case);

  wordset = set of wrange;
  charrange = 0..255;
  wimatrix = array[wrange, wrange] of INTEGER;
  wlmatrix = array[wrange, wrange] of LONGINT;
  limatrix = array[lrange, lrange] of INTEGER;
  lsmatrix = array[lrange, lrange] of wordstr;
  lsmatrixPtr = ^lsmatrix;
  pmatrix = array[phon_variants, phon_variants] of INTEGER;
  civector = array[CHAR] of INTEGER;
  ccvector = array[CHAR] of CHAR;
  choicelisttype = array[wrange] of wordset;
  solutionrec = record
      seq: str80;
      inversionk: INTEGER;
      firstinv: INTEGER;
    end;
  lset = set of 0..255;



var
  p: Ptr;
  h, lsmH: handle;
  mw,                           { vector of model words }
  rw,                           { vector of response words }
  wordmark: inputwsvector;      { vector of word markups }
  rwxloc,                       { vector of response word locations }
  m_to_r,                       { response wd matched to given model wd }
  r_to_m: inputwivector;        { model wd matched to response wd }
  runtogether: wivector;        { index of 2nd run-together model wd }
  pnoninversions,               { proportion non-inverted words }
  pmatched,                     { proportion words matched }
  cutoff,                       { spell check applied if length ratio }
                                { of 2 words falls below this value }
  prop_errors,                  { if edit dist between M and R word }
                                { exceeds this, then round to infinity }
  runon_criterion: REAL;        { spelling match necessary to consider }
                                { response word as run on }
  avedist: EXTENDED;            { average edit distance between matched wds }
                                { averaged over all matched pairs }
  winsert, wdelete, wchange,    { weights of various spelling errors }
  wtranspose, waccent, wcap,
  maxEditWeight: INTEGER;
  model, response: string;      { correct ans and student response }
  cap_flag: cap_flag_type;      { tells how to handle wrong cap letters }
  judgedok,                     { returns Ok or No for response }
  misspellok,                   { judge resp with misspellings Ok }
  extrawordsok,                 { judge resp w extra words Ok }
  anyorderok,                   { judge resp w words out of order Ok }
  runtogether_needed,           { enable/disable runtogether analysis }
  word_markup_needed,           { whether to generate sentence markup }
  adjust_needed,                { whether to adjust for optimal solution }
  markupMapsNeeded,             { whether to return markup maps lists to HyperCard }
  shortcut,                     { whether to shortcut when computing edit distance of very dissimilar words }
  trace: BOOLEAN;               { enable/disable tracing output }
  paramDisplayNeeded,           { which data structure info to return to HyperCard }
  rawTrace: CHAR;               { whether edit trace string returned should be "raw" or prettied up for display }

  nomark, addcap, dropcap,      { markup symbols }
  accenterr, extrawd, missingwd, movewd, extraltr, missingltr, substituteltr, transltr1, transltr2, runonwd: CHAR;
  delim_chars: set of CHAR;     { punctuation symbols }
  mwlg, rwlg: inputwivector;    { lengths of model, resp words }
  mwseq: inputwivector;         { map of model word index to model position index }
  editd: limatrix;              { matrix of normalized edit distances between word substrings }
  marksP: lsmatrixPtr;
  a: wlmatrix;
  aseq: wimatrix;
  rignore, mignore, matched, mmatched, rmaybe, mmaybe: wordset;
  dr: civector;
  symbolMap: ccvector;
  choices: choicelisttype;
  solutionlist: array[wrange] of solutionrec;
  rlg, mlg, plg, mwk, solutionk, edit_dist, matchedk, rightmost, recursions, solutions_tried: INTEGER;
  mincost: LONGINT;
  runtogetherflag: BOOLEAN;
  time: REAL;

  { Judging tables used to control capitalization/diacritic judging }
  base_char: array[CHAR] of CHAR;
  diacrit_info: array[CHAR] of diacrit_variants;
  case_info: array[CHAR] of case_variants;
  phon_info: array[CHAR] of phon_variants;
  phon_matrix: pmatrix;

{ UTILITY PROCEDURES }

{ ---------------------------------------- FAIL }
{ Return ERRMSG as the XFCN's return value, and immediately exit the XFCN. }
procedure FAIL (errMsg: Str255);
begin
  if marksP <> nil then
    disposPtr(PTR(marksP));
  paramPtr^.returnValue := PasToZero(paramPtr, errMsg);
  EXIT(Main);  { exit XFCN }
end; { FAIL }

{ ---------------------------------------- ReturnInGlobal }
{ Convenience proc for returning string value VALUE in a HyperCard global var GLOBALNAME. }
procedure returnInGlobal (globalName, value: str255);
var
  h: handle;
begin
  h := pasToZero(paramPtr, value);
  if h = nil then
    FAIL(concat('%Out of memory for return in global ', globalName))
  else
    begin
      setGlobal(paramPtr, globalName, h);
      disposHandle(h)
    end
end; { returnInGlobal }

{ ---------------------------------------- AppendStringToHandle }
{ Append string S at the end of the information pointed to by HANDLE. }
procedure appendStringToHandle (h: HANDLE; s: string);
var
  r: OSErr;
  errType: string[20];
begin
  r := ptrAndHand(Ptr(ORD(@s) + 1), h, Length(s));
  if r <> 0 then
    begin
      case r of
        memFullErr:
          errType := 'Memory Full';
        nilHandleErr:
          errType := 'NIL handle';
        memWZErr:
          errType := 'Mem block is free'
      end;
      FAIL(Concat('%AppendStringToHandle error: ', errType))
    end;
end; { appendStringToHandle }

{ ---------------------------------------- AppendStringToGlobal }
procedure appendStringToGlobal (gName: str255; s: str255);
var
  h: HANDLE;
  lg: INTEGER;
  hp: PTR;
begin
  h := getGlobal(paramPtr, gName);
  lg := GetHandleSize(h);
  hp := PTR(ORD(StripAddress(h^)) + lg - 1);   { Ptr to last byte of block. }
  if (lg > 0) & (hp^ = 0) then                 { If handle is non-nil and has null char terminator, }
    setHandleSize(h, lg - 1);                  { remove null char terminator. }
  appendStringToHandle(h, concat(s, CHR(0)));  { Append string plus null char terminator. }
  setGlobal(paramPtr, gName, h);
  disposeHandle(h);
end;

{ ---------------------------------------- NtoS }
function NtoS (num: INTEGER): str255;
begin
  numToStr(paramPtr, num, NtoS);
end; { NtoS }

{ ---------------------------------------- LtoS }
function LtoS (lng: LONGINT): str255;
begin
  longToStr(paramPtr, lng, LtoS);
end; { LtoS }

{ ---------------------------------------- EtoS }
function EtoS (r: REAL): Str255;
begin
  extToStr(paramPtr, r, EtoS);
end; { EtoS }

{ ---------------------------------------- BtoS }
function BtoS (b: BOOLEAN): str255;
begin
  if b then
    BtoS := 'True'
  else
    BtoS := 'False'
end; { BtoS }

{ ---------------------------------------- SetToString }
function setToString (st: lset): str255;
var
  i: INTEGER;
  s: Str255;
begin
  s := '';
  for i := 1 to 30 do
    if i in st then
      s := concat(s, '1,')
    else
      s := concat(s, '0,');
  setToString := s;
end; { setToString }



{ ---------------------------------------- Eq }
{ Convenience function to return case-insensitive equality of two strings }
function eq (s1, s2: str255): BOOLEAN;
begin
  eq := stringEqual(paramPtr, s1, s2)
end; { eq }



{ ---------------------------------------- Nthchunk }
{ Return the Nth chunk of string S, where a chunk is a substring lying between }
{ two DELIM characters (beginning & end of S are implicit delimiters). }
function nthchunk (s: str255; n: INTEGER; dchar: CHAR): str255;
var
  i, p: INTEGER;
  delim: string;
begin
  delim := dchar;
  for i := 1 to n - 1 do        { remove first n - 1 chunks from string }
    begin
      p := pos(delim, s);
      if p > 0 then
        delete(s, 1, p)
      else
        begin                   { If less than n-1 chunks, return EMPTY }
          nthchunk := '';
          EXIT(nthchunk)
        end
    end;
  p := pos(delim, s);           { Nth chunk is now at front of list }
  if p > 0 then
    nthchunk := copy(s, 1, p - 1)
  else
    nthchunk := s;
end; { nthchunk }

{ ---------------------------------------- Inc }
procedure inc (var x: integer);
begin
  x := x + 1;
end; { inc }

{ ---------------------------------------- Dec }
procedure dec (var x: integer);
begin
  x := x - 1;
end; { dec }

{ ---------------------------------------- Max }
function max (x, y: INTEGER): INTEGER;
begin
  if x > y then
    max := x
  else
    max := y
end; { max }

{ ---------------------------------------- Min }
function min (x, y: INTEGER): INTEGER;
begin
  if x < y then
    min := x
  else
    min := y
end; { min }



{ ---------------------------------------- Dup_char }
{ Return a string consisting of LG copies of char C (at most 255), built by repeated doubling. }
function dup_char (c: CHAR; lg: INTEGER): string;
var
  i: INTEGER;
  s: string;
begin
  if lg <= 0 then
    begin
      dup_char := '';
      EXIT(dup_char);
    end;
  if (lg > 255) then
    lg := 255;
  s := c;
  for i := 1 to 8 do
    if length(s) < lg then
      if length(s) >= 128 then
        s := Concat(s, Copy(s, 1, lg - length(s)))
      else
        s := Copy(Concat(s, s), 1, lg)
    else
      begin
        dup_char := s;
        exit(dup_char)
      end;
  dup_char := s;  { reached when the final pass completes the string, i.e. when lg > 128 }
end; { dup_char }

{ ---------------------------------------- Card }
{ Compute the cardinality of a set of type 'wordset'. }
function card (setofwords: wordset): INTEGER;
var
  i, k: INTEGER;
begin
  k := 0;
  for i := 1 to wmax do
    if (i in setofwords) then
      inc(k);
  card := k;
end; { card }

{ DEBUG I/O }

{ ---------------------------------------- Showset }
procedure showset (s: string; st: lset);
begin
  appendStringToGlobal('theMarkUpDebug', concat(s, ' - ', setToString(st), CHR(13)))
end;

{ ---------------------------------------- See_nedit_matrix }
procedure see_nedit_matrix (s: str255);
var
  m, r: INTEGER;
begin
  appendStringToGlobal('theMarkUpDebug', concat(CHR(13), 'A[R,M]: ', s, CHR(13)));
  for r := 1 to rlg do
    begin
      s := '';
      for m := 1 to mwk do
        s := concat(s, LtoS(a[r, m]), ' ');
      appendStringToGlobal('theMarkUpDebug', concat('R-', NtoS(r), ' ', s, CHR(13)));
    end
end; { see_nedit_matrix }

{ DISABLED I/O }
{ Following procedures disabled because MAC code resources cannot have standard I/O }

procedure pause;
begin
  {readln;}
end;

procedure clrScr;
begin
end;

{ INITIALIZE ALL STATIC DATA STRUCTURES, JUDGING TABLES, AND PARAMETERS. }
{ ---------------------------------------- Get_char_case }
function get_char_case (i: INTEGER): case_variants;
begin
  if Char(i) in ['A'..'Z'] then
    get_char_case := up_case
  else
    get_char_case := down_case
end; { get_char_case }

{ ---------------------------------------- Force_down_case }
function force_down_case (i: INTEGER): CHAR;
begin
  if get_char_case(i) = up_case then
    force_down_case := chr((ORD('a') - ORD('A')) + i)
  else
    force_down_case := chr(i);
end; { force_down_case }

{ ---------------------------------------- Set_base_info }
procedure set_base_char (c: CHAR; s: str80);
var
  i: INTEGER;
begin
  for i := 1 to Length(s) do
    if s[i] <> space then
      base_char[s[i]] := c;
end; { set_base_char }

{ ---------------------------------------- Set_diacrit_info }
procedure set_diacrit_info (d: diacrit_variants; s: str80);
var
  i: INTEGER;
begin
  for i := 1 to Length(s) do
    if s[i] <> space then
      diacrit_info[s[i]] := d
end; { set_diacrit_info }

{ ---------------------------------------- Set_cap_info }
procedure set_cap_info (c: case_variants; s: str80);
var
  i: INTEGER;
begin
  for i := 1 to Length(s) do
    if s[i] <> space then
      case_info[s[i]] := c
end; { set_cap_info }

{ ---------------------------------------- Set_phon_info }
procedure set_phon_info (p: phon_variants; s: str80);
var
  i: INTEGER;
begin
  for i := 1 to Length(s) do
    if s[i] <> space then
      phon_info[s[i]] := p
end; { set_phon_info }

{ ---------------------------------------- Create_char_info_tables }
{ Initialize all character information tables. These tables provide }
{ descriptive information about each of the 255 characters in the Mac }
{ character set used for the model and response. }
{ Global data structures affected: }
{   base_char : vector specifying the base (unaccented) char corresponding }
{               to each char. }
{   diacrit_info : vector specifying the type of diacritic mark which }
{               modifies each char. }
{   case_info : vector specifying the case (upper or lower) of each char. }
{   phon_info : vector specifying whether each char is vowel or consonant. }
procedure create_char_info_tables;
var
  i, lineNo: INTEGER;
  c: CHAR;
  s: str80;
  str: Str255;
  h: handle;
begin { create_char_info_tables }
  for i := 1 to 255 do
    begin
      c := CHAR(i);
      base_char[c] := CHAR(force_down_case(i));
      case_info[c] := get_char_case(i);
      diacrit_info[c] := no_accent;
      phon_info[c] := consonant
    end; { DO }

  { Replace base_char default value for accented chars. Chars with accents have }
  { the unaccented version as base char. Unaccented chars have themselves as base }
  { char (this is the default case). }

  { NOTE: These settings assume that the font is COURIER or a compatible font!! }
  { They may not display properly here in a font other than Courier. }

tet base char ('a', 'i A « A i H A i U A'); 

iet~ba«e_char (•••, E * E A E 4 EM; 

set_base~char('i\ 'I I 1 1 1 H !')» 

set basecharCo', 'fl 0 6 6 6 0 6 6 6 0'); 

set~base_char fu\ 'QOuOoOuO'); 

set_base_char (• y • , • y ?'); 

set_base_char ( 'c' , *c CM; 

iet_baae_char ( 'n 1 , 'ft f)M; 
  { Enter proper diacritic information for accented chars. Unaccented chars have }
  { "no_accent" as their diacritic. Accented chars are assigned the proper accent }
  { mark. These settings assume COURIER font, and may not display properly in }
  { another font. }

set_diacrit__info (acute, • a A e £ i t 6 0 u 0' ) ; 

set__diecrit~info (grave, ' a A 4 E i t 6 6 u 0* ) ; 

set~diacrit_info (circumflex, 'i U £ 1 1 0 fl 0'); 

iet__diacrit_info(diarsis, 'i H i 6 I 1 i 0 Q 0 H').' 

set_diacrit_inf o (supero, 'A k* ) 3 

set_diacrit_info (cedilla, 'c CM; 

set_diacrit~info (tilde, 'n ft a A o 0'); 

set_diecrit_info(autcron, **)t { IBM PC had son* ucron chars 1 

  { Enter case info for upper case accented letters. This supplements the default }
  { assignment of "up_case" to A..Z. - COURIER font. }

set_cap_info(u P _case, •AEl0uAElC-uSAEtO0A0AEtOUA£(Efi3NCM; 



  { Set phon info for vowels. This overrides the default setting of "consonant". }
  { Specify both upper and lower case, separately for chars with diacrits. - COURIER font. }

set_phon_info (vowel, 'aeiouyAEIOUYaeaiaM; 

set_phon_info (vowel, 'aei6ua«I6uyaei6ua6ael6u'); 

set _phon_info (vowel, 'A fil <J D A E I 0 0 U fi i 0 OA 0 A £ t 0 0 A £ 2 0'!; 

  { Set cap info for accented chars - COURIER font }
end; { create_char_info_tables }

{ ---------------------------------------- OverrideCharInfoTables }
{ Take a line of char info specs and install them in the proper char info table. }
procedure OverrideCharInfoTables (specs: Str255);
var
  i, n: INTEGER;
  switch: char;
  d: diacrit_variants;
  c: case_variants;
  p, q: phon_variants;
  s, ch: Str255;
begin
  switch := specs[1];            { Specifies type of info. }
  delete(specs, 1, 2);           { leading char and following comma }
  i := pos(',', specs);
  if i = 0 then
    FAIL(Concat('%Missing comma after switch in judging table line: ', specs));
  ch := Copy(specs, 1, i - 1);   { Specifies base char or variant value for following list. }
  if ch = '' then
    FAIL('%Missing base char or variant specifier.');
  delete(specs, 1, i);
  case switch of
    'b':  { base char }
      set_base_char(ch, s);
    'd':  { diacritic information }
      begin
        if eq(ch, 'acute') then
          d := acute
        else if eq(ch, 'grave') then
          d := grave
        else if eq(ch, 'circumflex') then
          d := circumflex
        else if eq(ch, 'diarsis') then
          d := diarsis
        else if eq(ch, 'supero') then
          d := supero
        else if eq(ch, 'cedilla') then
          d := cedilla
        else if eq(ch, 'tilde') then
          d := tilde
        else if eq(ch, 'macron') then
          d := macron
        else
          FAIL(Concat('%Bad diacritic variant value: ', ch));
        set_diacrit_info(d, s);
      end;
    'c':  { capitalization information }
      begin
        if eq(ch, 'up_case') then
          c := up_case
        else if eq(ch, 'down_case') then
          c := down_case
        else
          FAIL(Concat('%Bad cap variant value: ', ch, ' ', specs));
        set_cap_info(c, s);
      end;
    'p':  { phon information }
      begin
        if eq(ch, 'vowel') then
          p := vowel
        else if eq(ch, 'consonant') then
          p := consonant
        else if eq(ch, 'phon3') then
          p := phon3
        else if eq(ch, 'phon4') then
          p := phon4
        else if eq(ch, 'phon5') then
          p := phon5
        else
          FAIL(Concat('%Bad phon variant value: ', ch));
        set_phon_info(p, s);
      end;
    otherwise
      FAIL(concat('%Bad judging table switch: ', switch));
  end; { CASE switch OF }
end; { overrideCharInfoTables }

{ ---------------------------------------- Init_markup }
{ Set values for program parameters and data structures. }
{ First look to see if values have been provided in these 5 global variables: }
{     theMarkupPunctuation }
{     theMarkupSymbols }
{     theMarkupWeights }
{     theMarkUpCharInfo }
{     theMarkUpPhonMatrix }
{ If so, process those values to set the data; otherwise use default values. }

procedure init_markup;

  { ---------------------------------------- SetDelimiters }
  { Specify characters which will serve as punctuation in model and response. }
  { If there are data in the global variable 'theMarkupPunctuation', use them. }
  { Otherwise, set values below as default values. }
  { Chr(13) is MAC/HyperCard RETURN char, which starts new line. }
  procedure setDelimiters;
  var
    h: handle;
    s: str255;
    i: integer;
  begin
    h := getGlobal(paramPtr, 'theMarkupPunctuation');
    zeroToPas(paramPtr, h^, s);
    disposHandle(h);
    if s = '' then
      dellm_chara :- [« \ «.«, «,«, ' ; « , >(', •)', '('# MS '<'# '>', '?'■ '!', Chr(13)]
    else
      begin
        delim_chars := [];
        for i := 1 to Length(s) do
          delim_chars := delim_chars + [s[i]];
      end
  end; { setDelimiters }

  { ---------------------------------------- SetSymbols }
  { Specify characters which will serve as punctuation in model and response. }
  { If there are data in the global var 'theMarkupSymbols', use them. }
  { Otherwise, set default values below. }
  procedure setSymbols;
  var
    h: handle;
    i: INTEGER;
    Ss, s: str255;
  begin
    h := getGlobal(paramPtr, 'theMarkupSymbols');
    zeroToPas(paramPtr, h^, s);
    disposHandle(h);
    if s = '' then
      begin
        addcap := '↑';          { uparrow }
        dropcap := '↓';         { downarrow }
        accenterr := '~';       { tilde }
        extrawd := 'X';         { capital X }
        missingwd := 'Δ';       { capital delta }
        movewd := '◄';          { solid leftward arrowhead }
        extraltr := 'x';        { small x }
        missingltr := '\';      { backslash }
        substituteltr := '=';   { equal sign }
        transltr1 := '>';       { right angle bracket }
        transltr2 := '<';       { left angle bracket }
        runonwd := '[';         { left square bracket }
      end
    else
      begin
        addcap := s[1];
        dropcap := s[2];
        accenterr := s[3];
        extrawd := s[4];
        missingwd := s[5];
        movewd := s[6];
        extraltr := s[7];
        missingltr := s[8];
        substituteltr := s[9];
        transltr1 := s[10];
        transltr2 := s[11];
        runonwd := s[12];
      end;

    s := concat(addcap, dropcap, accenterr, extrawd, missingwd, movewd, extraltr, missingltr, substituteltr, transltr1, transltr2,
                runonwd);
    Ss := concat(Saddcap, Sdropcap, Saccenterr, Sextrawd, Smissingwd, Smovewd, Sextraltr, Smissingltr, Ssubstituteltr, Stransltr1,
                 Stransltr2, Srunonwd);
    for i := 1 to Length(Ss) do
      symbolMap[Ss[i]] := s[i];
    nomark := space;
    symbolMap[Snomark] := nomark;
  end; { setSymbols }

  { ---------------------------------------- Set_judging_tables }
  procedure set_judging_tables;
  var
    h: handle;
    p: ptr;
    s: str255;
  begin
    { First, initialize all judging tables with default values. }
    create_char_info_tables;

    { Override default values with user-specified values in THEMARKUPCHARINFO. }
    h := getGlobal(paramPtr, 'theMarkupCharInfo');
    p := h^;
    while True do
      begin
        while p^ = 13 do                { If at CR, move to next char }
          p := PTR(ORD(p) + 1);         { If line is empty, skip over it. }
        if p^ = 0 then                  { If at end of string, exit. }
          LEAVE;
        returnToPas(paramPtr, p, s);    { If real line, get it }
        overrideCharInfoTables(s);      { and install its values. }
        scanToReturn(paramPtr, p);      { Move to the next CR }
      end;
    disposHandle(h);  { Release the handle only after parsing, since p points into its block. }
  end; { set_judging_tables }
{ ------------------------------------------------------------ Set_phon_matrix }

{ Specify characters which will serve as punctuation in model and response. }
{ If there are data in the global var 'theMarkupSymbols', use them. }
{ Otherwise, set default values below. }
procedure set_phon_matrix;
var
  h: handle;
  s: str255;
  p, q: phon_variants;
  i, w: INTEGER;
begin
  h := getGlobal(paramPtr, 'theMarkUpPhonMatrix');
  zeroToPas(paramPtr, h^, s);
  disposHandle(h);

  if s = '' then
    { Put default values into the substitution weight matrix, phon_matrix. }
    { For a given cell PHON_MATRIX[M, R], M is the phonetic category of a model }
    { character MC and R is the phonetic category of a response character RC. }
    { The integer value in the cell is the weight attached to substituting RC for MC. }
    { The default values below are equal to WCHANGE if MC and RC are in the same category; }
    { if they are in different categories, the cost of a substitution is 1.2 times WCHANGE. }
    begin
      for p := vowel to phon5 do
        for q := vowel to phon5 do
          if p = q then
            phon_matrix[p, q] := wchange
          else
            phon_matrix[p, q] := TRUNC(1.2 * wchange);
      maxEditWeight := TRUNC(1.2 * wchange);
    end
  else
    { If the HyperCard global THEMARKUPPHONMATRIX is not empty, read values from it. }
    begin
      maxEditWeight := wchange;
      i := 1;
      for p := vowel to phon5 do
        for q := vowel to phon5 do
          begin
            w := strToNum(paramPtr, nthChunk(s, i, ','));
            phon_matrix[p, q] := w;
            maxEditWeight := max(maxEditWeight, w);
            inc(i);
          end
    end;
end; { set_phon_matrix }
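The two branches of set_phon_matrix can be sketched in Python as follows. This is only an illustration, not the XFCN's code: the category names, the function names, and the row-major reading order of the comma-separated items are assumptions made for the example, and integer arithmetic stands in for TRUNC(1.2 * wchange).

```python
def default_phon_matrix(categories, wchange=30):
    """Default weights: wchange within a category, 1.2 * wchange across categories."""
    cross = (12 * wchange) // 10  # same value as TRUNC(1.2 * wchange) for positive weights
    matrix = {(p, q): (wchange if p == q else cross)
              for p in categories for q in categories}
    return matrix, cross  # in the default case the cross weight is also the maximum


def phon_matrix_from_items(categories, items, wchange=30):
    """Read weights from a comma-separated string, one item per cell (row-major, assumed)."""
    values = [int(v) for v in items.split(",")]
    matrix = {}
    max_edit_weight = wchange  # the listing starts the running maximum at wchange
    i = 0
    for p in categories:
        for q in categories:
            matrix[(p, q)] = values[i]
            max_edit_weight = max(max_edit_weight, values[i])
            i += 1
    return matrix, max_edit_weight


cats = ["vowel", "phon1"]  # hypothetical category names
default_matrix, default_max = default_phon_matrix(cats)
custom_matrix, custom_max = phon_matrix_from_items(cats, "10,50,50,10")
```

The maximum weight matters later because the normalized edit distance divides by it.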
{ ----------------------------------------------------------------- SetWeights }

{ Specify weights and thresholds which control spelling analysis. }
{ Values must be contained in the global var THEMARKUPWEIGHTS, and }
{ must appear as comma-separated items, in this order: }
{   winsert, wdelete, wchange, wtranspose, wcap, waccent, cutoff, prop_errors, runon_criterion }
{ If any of these items is EMPTY, a default value will be used. If THEMARKUPWEIGHTS }
{ does not exist or is empty, all default values will be used. }

procedure setWeights;
var
  h: handle;
  v, s: str255;
begin
  { These are the default weights assigned to the various edit operations, chosen so }
  { that the cost of a change is less than that of a deletion followed by an insertion. }
  { Also, the cost of a change, or of a deletion/insertion sequence, is greater than }
  { that of a transposition. The "standard" distances of 2,2,3,2 have been multiplied }
  { by 10 so that accent and cap errors can be scored at a lower value. }
  winsert := 20;
  wdelete := 20;
  wchange := 30;
  wtranspose := 20;
  { waccent := ...;   default value illegible in the scanned listing }
  { wcap := 1...;     default value partly illegible in the scanned listing }

  { Ratio of word lengths must be nearer than this to 1 or the edit distance between }
  { them will be automatically set to infinity (used only when 'shortcut' is TRUE). }
  { cutoff := ...;    default value illegible in the scanned listing }

  { This parameter controls the proportion of spelling edits which can occur when }
  { attempting to match two words before the two words will be considered non-matches. }
  prop_errors := 0.35;

  { This is the max normalized edit distance which can exist between 2 model words m + mm }
  { and a response word r before r can be considered to be m and mm run together. }
  runon_criterion := 0.2;

  { Override the defaults with values in global variable THEMARKUPWEIGHTS. }
  h := getGlobal(paramPtr, 'theMarkupWeights');
  zeroToPas(paramPtr, h^, s);
  disposHandle(h);

  if s <> '' then
    begin
      v := nthChunk(s, 1, ',');
      if v <> '' then
        winsert := strToNum(paramPtr, v);

      v := nthChunk(s, 2, ',');
      if v <> '' then
        wdelete := strToNum(paramPtr, v);

      v := nthChunk(s, 3, ',');
      if v <> '' then
        wchange := strToNum(paramPtr, v);

      v := nthChunk(s, 4, ',');
      if v <> '' then
        wtranspose := strToNum(paramPtr, v);

      v := nthChunk(s, 5, ',');
      if v <> '' then
        wcap := strToNum(paramPtr, v);

      v := nthChunk(s, 6, ',');
      if v <> '' then
        waccent := strToNum(paramPtr, v);

      v := nthChunk(s, 7, ',');
      if v <> '' then
        cutoff := strToExt(paramPtr, v);

      v := nthChunk(s, 8, ',');
      if v <> '' then
        prop_errors := strToExt(paramPtr, v);

      v := nthChunk(s, 9, ',');
      if v <> '' then
        runon_criterion := strToExt(paramPtr, v);
    end;
end; { setWeights }

begin { init_markup }
  { Specify cap, accent, vowel, and base-char properties of chars. }
  { First set defaults, then look for values in global vars. }
  set_judging_tables;

  { Specify which input chars will serve as 'punctuation' (word delimiters). }
  setDelimiters;

  { Specify which symbols to use for markup display. }
  setSymbols;

  { Set numerical weights and thresholds which control judging process. }
  setWeights;

  { Set values in PHON_MATRIX, which determines substitution cost for various }
  { combinations of character categories. Also computes maxEditWeight. }
  set_phon_matrix;
end; { init_markup }
{ ============================== INPUT PARSING ============================== }

{ ------------------------------------------------------------- Segment_string }

{ Processes the model (correct answer) string and the (student's) response string }
{ and puts them into an internal format suitable for further processing. }
{ String is segmented into words and the total number of words, as well as the }
{ length of each word, is recorded. While breaking out individual words, all }
{ extraneous characters (extra spaces and punctuation) are discarded. }

{ If string is a model, the special syntax of ignorable words and synonyms is }
{ interpreted, and a list (in set format) of ignorable word positions is built, }
{ as well as information on which words are synonyms and which sequential }
{ position each word occupies in the sentence. All the synonyms in a group }
{ share the same sequential position. }

{ Special syntax for correct answer: ignorable words are placed within angle }
{ brackets, and synonymous words within square brackets, e.g.: }

{   The quick < brown > fox [ jumped leaped ] over the < lazy > dog }

{ Input vars: }
{   c       : string to be processed (correct answer or response) }
{   ismodel : True if string is a correct answer; False if it is student response. }

{ Return vars: }
{   wk      : number of words }
{   w       : vector of words (strings) }
{   wlg     : parallel vector of word lengths (char counts) }
{   waux    : If string is model, }
{               position number of each word (all the synonyms in a list share the }
{               same position number), or, }
{             if string is response, }
{               column location of leftmost letter of each word (entire response }
{               is assumed to be on a single screen line). }
{   mignore : list (in set format) of word sequence numbers to ignore }

procedure segment_string (var wk: INTEGER; var c: string; var w: inputwsvector; var wlg, waux: inputwivector; ismodel: BOOLEAN);
var
  i, p, position, lg: INTEGER;
  x: wordstr;
  syn_list, ignore_flag: BOOLEAN;
begin
  c := concat(c, ' ');
  lg := Length(c);
  wk := 0;
  position := 1;
  syn_list := False;      { Turn on when processing synonym list. }
  ignore_flag := False;   { Turn on when processing ignorable word list. }
  if ismodel then
    mignore := [];

  { Take successive chars from string to build next word. If word has become too }
  { long, set error flag and exit. }
  i := 1;
  while i <= lg do
    begin
      x := '';
      p := i;
      while (i <= lg) and not (c[i] in delim_chars) do
        begin
          x := concat(x, c[i]);
          Inc(i)
        end;

      { If word not null, then update word count, word vector, word length vector. }
      if x <> '' then
        begin
          { If sentence would have more than the allowable number of words, or if the }
          { current word is too long, set error flag and exit. }
          if (wk + 1) > word_max then
            FAIL('%Too many words in input.');
          if Length(x) > lmax then
            FAIL(concat('%Input word too long: ', x));
          Inc(wk);
          w[wk] := x;
          wlg[wk] := Length(x);

          { If string is model, also update synonym list, ignorable word list, and map }
          { of model word numbers to model positions. }
          if ismodel then
            begin
              if ignore_flag then
                mignore := mignore + [position];
              waux[wk] := position;
            end
          else
            waux[wk] := p;
          if not syn_list then
            Inc(position);
        end;

      { Process delimiters trailing at end of word, including ignorable and synonym list }
      { delimiters. If one of the latter is encountered, set or clear the appropriate }
      { flags. }
      while (i <= lg) and (c[i] in delim_chars) do
        begin
          if ismodel then
            case c[i] of
              '[':
                syn_list := True;
              ']':
                begin
                  syn_list := False;
                  Inc(position);
                end;
              '<':
                ignore_flag := True;
              '>':
                ignore_flag := False;
            end; { CASE }
          Inc(i);
        end; { WHILE ( i <= lg ) AND ( c[ i ] IN delim_chars ) }
    end; { WHILE ( i <= lg ) }

  { Clean-up code for end-of-string condition. If number of response words less 1, }
  { or number of positions in model, exceeds the dimension of the (square) sim }
  { matrix, then set error flag and exit. Otherwise store the number of positions in }
  { the model, if string is model, or the column number of the first character }
  { beyond the end of the response (used later for markup). }
  case ismodel of
    True:
      if position > wmax then
        FAIL('%Too many word positions in model.')
      else
        plg := position - 1;
    False:
      if (wk + 1) > wmax then
        FAIL('%Too many words in input.')
      else
        waux[wk + 1] := Length(c) + 1;
  end; { CASE }
end; { segment_string }
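The model-string branch of segment_string can be sketched in Python as below. It is an illustration under stated assumptions rather than a port: the function name and the default delimiter set are made up for the example (the listing takes its delimiters from delim_chars), and the word-count and word-length limits are left out.

```python
def segment_model(model, delims=" ,.;:!?<>[]"):
    """Split a model answer into words, position numbers and ignorable positions.

    Words inside [ ] share one position (synonyms); words inside < > are ignorable.
    The delimiter set is an assumption made for this sketch.
    """
    words, positions, ignorable = [], [], set()
    position, in_synonyms, in_ignorable = 1, False, False
    token = ""
    for ch in model + " ":           # trailing blank flushes the last word
        if ch not in delims:
            token += ch
            continue
        if token:                    # a word just ended: record it
            words.append(token)
            positions.append(position)
            if in_ignorable:
                ignorable.add(position)
            if not in_synonyms:      # synonyms do not advance the position
                position += 1
            token = ""
        if ch == "[":
            in_synonyms = True
        elif ch == "]":
            in_synonyms = False
            position += 1            # the whole synonym list used one position
        elif ch == "<":
            in_ignorable = True
        elif ch == ">":
            in_ignorable = False
    return words, positions, ignorable, position - 1


words, positions, ignorable, n_positions = segment_model(
    "The quick < brown > fox [ jumped leaped ] over the < lazy > dog")
```

For the example sentence from the comment block, "jumped" and "leaped" both receive position 5, and positions 3 and 8 are marked ignorable.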

procedure segmentModel;
var
  i: INTEGER;
begin
  for i := 1 to word_max do
    mw[i] := '';
  segment_string(mwk, model, mw, mwlg, mwseq, True);
end; { segmentModel }

procedure segmentResponse;
var
  i: INTEGER;
begin
  for i := 1 to word_max do
    rw[i] := '';
  segment_string(rlg, response, rw, rwlg, rwxloc, False);
end; { segmentResponse }

{ ------------------------------------------------------------------- SetModel }

procedure setModel (s: string);
begin
  model := s;
  segmentModel;
end; { setModel }

{ ---------------------------------------------------------------- SetResponse }

procedure setResponse (s: string);
begin
  response := s;
  segmentResponse;
end; { setResponse }



{ ============================ SPELLING ANALYSIS ============================ }

{ -------------------------------------------------------------- Init_spelling }

{ Initialize all matrix data structures used by the dynamic programming algorithm }
{ which generates "nearest match" misspelling markup. These data structures }
{ are all global. The (global) vars affected are: }
{   editd : matrix of (minimal) edit distances }
{   marks : parallel matrix of (minimal) markup corresponding to each edit distance }

procedure init_spelling (var marks: lsmatrix);
var
  i: INTEGER;
begin
  { The right-hand sides of the next two assignments, the marks border values, and }
  { the loop bound were lost in the scanned listing; the values shown are assumed. }
  editd[0, 0] := 0;
  marks[0, 0] := '';
  for i := -1 to lmax do
    begin
      marks[i, -1] := '';
      marks[-1, i] := '';
      editd[i, -1] := infinity;
      editd[-1, i] := infinity;
    end;

  for i := 1 to lmax do
    begin
      editd[i, 0] := editd[i - 1, 0] + wdelete;
      editd[0, i] := editd[0, i - 1] + winsert;
      marks[i, 0] := concat(marks[i - 1, 0], Sextraltr);
      marks[0, i] := Smissingltr;
    end;
end; { init_spelling }



{ Return system markup char for capitalization and/or accent error. Character M, }
{ assumed to be from the model, is compared to character R, assumed to }
{ be in the student's response. If R has both a cap error and an accent }
{ error, the cap error takes precedence. }

function capMark (m, r: CHAR): CHAR;
var
  mcase, rcase: case_variants;
  mark: CHAR;
begin
  mcase := case_info[m];
  rcase := case_info[r];
  mark := Snomark; { Default is no mark }

  if (cap_flag <> ignore_case) and (mcase <> rcase) then
    case cap_flag of
      exact_case:
        if rcase = down_case then
          mark := Saddcap
        else
          mark := Sdropcap;
      authors_caps:
        if mcase = up_case then
          mark := Saddcap;
    end; { CASE }

  { If case is ignored or ok, or user's case mark is blank, then check to see if accents match. }
  if (mark = Snomark) & (diacrit_info[m] <> diacrit_info[r]) then
    mark := Saccenterr;

  capMark := mark;
end; { capMark }
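The decision order in capMark (case first, accent only when no case mark was produced) can be sketched in Python. The sketch is hypothetical in its details: it uses Unicode case and decomposition data in place of the listing's case_info and diacrit_info tables, and returns descriptive strings instead of the system markup characters.

```python
import unicodedata


def diacritics(ch):
    """Combining marks carried by a character, e.g. the acute accent of 'é'."""
    return "".join(c for c in unicodedata.normalize("NFD", ch)
                   if unicodedata.combining(c))


def cap_mark(m, r, cap_flag="exact_case"):
    """Return 'addcap', 'dropcap', 'accent' or '' for model char m and response char r."""
    mark = ""
    m_up, r_up = m.isupper(), r.isupper()
    if cap_flag != "ignore_case" and m_up != r_up:
        if cap_flag == "exact_case":
            mark = "dropcap" if r_up else "addcap"
        elif cap_flag == "authors_caps" and m_up:
            mark = "addcap"          # only the author's capitals are enforced
    # The accent is checked only when no case mark was produced,
    # so a case error takes precedence over an accent error.
    if mark == "" and diacritics(m) != diacritics(r):
        mark = "accent"
    return mark
```

Under authors_caps a response capital where the model has lower case is tolerated, which is why that branch only fires when the model character is upper case.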



{ ---------------------------------------------------------------- AccentError }

{ Compare the two chars M and R. Assign score WCAP for a cap error, WACCENT for }
{ an accent error, and return the total score. C returns the type of error(s): Snomark for none, }
{ 'u' and 'd' for case error only; Saccenterr for accent error only; 'U' and 'D' for both case and accent error. }
{ The detailed info is useful only for return to user when raw markup string is requested; }
{ CAPMARK regenerates it during word markup. }

function accentError (m, r: CHAR; var c: CHAR): INTEGER;
var
  errK: INTEGER;
begin
  { See if cases match. }
  if (case_info[m] = case_info[r]) | (cap_flag = ignore_case) | ((cap_flag = authors_caps) & (case_info[r] = up_case)) then
    begin
      errK := 0;
      c := Snomark { No case err }
    end
  else
    begin
      errK := wcap;
      if case_info[m] = up_case then
        c := 'u' { Case error }
      else
        c := 'd'; { Case error }
    end;

  { Also check to see if accents match. }
  if diacrit_info[m] = diacrit_info[r] then
    accentError := errK
  else
    begin
      accentError := errK + waccent;
      if c = Snomark then
        c := Saccenterr { Accent err only }
      else if c = 'u' then
        c := 'U'
      else
        c := 'D' { Case and accent err }
    end;
end; { accentError }

{ ----------------------------------------------------------------- SpellMarks }

{ Converts the "raw" spelling markup returned by the least-distance algorithm to }
{ a markup suitable for display: (a) When two letters match, checks case/accent }
{ and generates the case/accent markup, if any; (b) reduces a sequence of omission marks, }
{ "\", to a single omission mark; (c) suppresses any markup of letter following an }
{ omission, since the omission marker "\" occupies the space beneath the next }
{ character following the omission; (d) substitutes a blank space for "-" as an }
{ indicator of properly matched letters. }

function spellMarks (var marks: wordstr; var m, r: wordstr): wordstr;
const
  preemptives = [Smissingltr, Srunonwd];
var
  i, j, k: INTEGER;
  mc, usermark: CHAR;
  preempted: BOOLEAN;
  markup: wordstr;
begin
  i := 0;
  j := 0;
  preempted := False;
  markup := '';

  for k := 1 to Length(marks) do
    begin
      Inc(i);
      Inc(j);
      mc := marks[k];

      if mc in letterErrors then        { case or accent or no error }
        mc := capMark(m[i], r[j])       { system error char }
      else if (mc = Sextraltr) then
        Dec(i)
      else if (mc in preemptives) then
        Dec(j);

      { If the user symbol for missing wd or runtogether is space, do not preempt spelling mark. }
      usermark := symbolMap[mc];
      if not preempted then
        if (mc in preemptives) then
          if (usermark = nomark) then
            preempted := False
          else
            begin
              preempted := True;
              markup := concat(markup, usermark);
            end
        else
          begin
            preempted := False;
            markup := concat(markup, usermark);
          end
      else
        { If preemption is on, skip current markup char, but turn off preemption to accept successive ones. }
        preempted := False;
    end; { FOR }

  spellMarks := markup
end; { spellMarks }



{ ----------------------------------------------------------------- Nedit_dist }

{ Computes normalized minimal spelling edit distance between two strings R and M and }
{ (optionally) the markup string which corresponds to that edit distance. }

{ Input vars: }
{   r          : Single word from response string }
{   m          : Single word from model (correct answer) }
{   markflag   : TRUE if markup corresponding to edit distance is to be returned; }
{                FALSE if no markup string needed. }
{   shortcut   : TRUE if words of too different length will be given distance infinity; }
{                FALSE if exact distance must be computed. }

{ Return vars: }
{   markup     : Markup string (if requested). }
{   nedit_dist : Normalized edit distance (between 1 and 0). }
{   edit_dist  : (GLOBAL var): weighted, unnormalized edit distance. }



function nedit_dist (var r, m, markup: wordstr; marks: lsmatrix; markflag, shortcut: BOOLEAN): REAL;
var
  i, j, flag, x, x2, x3, x4, d, ml, rl, db, i1, j1: INTEGER;
  ratio: REAL;
  c, lastc, mc, rc, mk: CHAR;
  t: wordstr;
begin
  runtogetherflag := False;

  { If doing standard order analysis, handle some special cases. }
  { If raw trace was requested, go directly to produce minimal trace. }
  if rawTrace = 'x' then
    begin
      { If the two words match exactly, return edit distance of 0. }
      if m = r then
        begin
          nedit_dist := 0;
          edit_dist := 0;
          markup := '';
          EXIT(nedit_dist);
        end;

      ml := Length(m);
      rl := Length(r);

      { If word lengths vary too much, and shortcut flag is set, skip further }
      { analysis and return infinite edit distance. }
      if shortcut then
        begin
          if ml < rl then
            ratio := ml / rl
          else
            ratio := rl / ml;
          if ratio < cutoff then
            begin
              nedit_dist := infinity;
              edit_dist := infinity;
              markup := '';
              EXIT(nedit_dist)
            end;
        end; { IF shortcut }
    end
  else { rawTrace = 'r' or 'p'. }
    begin
      markup := '';
      ml := Length(m);
      rl := Length(r);
    end;

  { Otherwise, compute the edit distance between the two words using dynamic }
  { programming algorithm expressing recursive relation between left substring }
  { distances. This is a form of exhaustive search, implemented here by iteration }
  { rather than true recursion. }

  { Initialize temporary memory array. }
  { dr[ch] will store location in resp where char ch last appeared. }
  for i := 1 to 255 do
    dr[chr(i)] := 0;

  { Main loops to fill matrix of substring edit distances. }
  for i := 1 to rl do
    begin
      db := 0;
      for j := 1 to ml do
        begin
          mc := m[j];
          rc := r[i];
          i1 := dr[base_char[mc]];   { last occurrence of mc in resp }
          j1 := db;                  { last matched char in model }

          { Check for identity or substitution of end chars in each string. }
          if mc = rc then
            begin
              d := 0;                { dist between chars mc and rc }
              db := j;               { model position of last matched char }
              mk := Snomark;
            end
          else if base_char[mc] = base_char[rc] then
            begin
              d := accentError(mc, rc, mk);   { returns d and sets mk }
              db := j;
            end
          else
            begin
              d := phon_matrix[phon_info[mc], phon_info[rc]];   { subst dist for type mc and type rc }
              mk := Ssubstituteltr;
            end;

          { Find cost of matching via omission, insertion, substitution, and transposition. }
          x := editd[i - 1, j - 1] + d;
          x2 := editd[i - 1, j] + winsert;
          x3 := editd[i, j - 1] + wdelete;
          x4 := editd[i1 - 1, j1 - 1] + (i - i1 - 1) * wdelete + wtranspose + (j - j1 - 1) * winsert;

          { Select the match which yields least cost. Start by assuming omission (x1). }
          flag := 1;
          if x2 < x then
            begin
              x := x2;
              flag := 2
            end;
          if x3 < x then
            begin
              x := x3;
              flag := 3
            end;
          if x4 < x then
            begin
              x := x4;
              flag := 4
            end;
          editd[i, j] := x;

          { If markup return is requested, generate markup string for this char pair. }
          { When marking an omission, use special mark for omission of space, which indicates }
          { run together words. }
          if (flag = 3) and (m[j] = space) then
            runtogetherflag := True;

          if markflag then
            case flag of
              1:
                marks[i, j] := concat(marks[i - 1, j - 1], mk);
              2:
                marks[i, j] := concat(marks[i - 1, j], Sextraltr);
              3:
                if m[j] = space then
                  marks[i, j] := concat(marks[i, j - 1], Srunonwd)
                else
                  marks[i, j] := concat(marks[i, j - 1], Smissingltr);
              4:
                marks[i, j] := concat(marks[i1 - 1, j1 - 1], Stransltr1, dup_char('•', i - i1 - 1), Stransltr2)
            end; { CASE flag OF }
        end; { FOR j DO }

      dr[base_char[r[i]]] := i;
    end; { FOR i DO }

  { Minimum weighted, unnormalized edit distance is now in lower right entry of editd matrix. }
  { This number includes weights due to accent and case errors. Save it in global var edit_dist. }
  edit_dist := editd[rl, ml];

  { Get and return normalized edit distance, nedit_dist, by dividing by maximum total cost of converting }
  { a string of length rl to one of length ml. }
  nedit_dist := edit_dist / (maxEditWeight * min(ml, rl) + wdelete * (max(ml, rl) - min(ml, rl)));

  { Return markup string. Will be null if none was generated in loop above. }
  if rawTrace = 'r' then
    markup := marks[rl, ml]   { Return raw edit markup. }
  else
    markup := spellMarks(marks[rl, ml], m, r);   { Return "pretty" markup suitable for display. }
end; { nedit_dist }

{ ----------------------------------------------------------------- Edit_trace }

{ Top-level control to generate a least-cost edit trace on the strings MODEL and RESPONSE. }
{ Form of trace (raw or pretty) is controlled by global var RAWTRACE values 'r' or 'p'. }
{ Edit trace string is put into direct HyperCard XFCN return. Indirect return is put into }
{ the HyperCard global var THEMARKUPRETURNVALUES: two comma-separated items, }
{ the first the raw edit distance, and the second the normalized edit distance. }

procedure edit_trace (model, response: str255);
var
  ned: REAL;
  c: CHAR;
  s1, s2: str255;
  ms, rs, marks: wordstr;
begin
  if (length(model) > lmax) | (length(response) > lmax) then
    FAIL('%Input string length exceeds max word length.')
  else
    begin
      c := ',';
      ms := copy(model, 1, 25);
      rs := copy(response, 1, 25);
      init_spelling(marksP^);

      { Return markup string in MARKS and weighted unnormalized edit dist in EDIT_DIST. }
      ned := nedit_dist(rs, ms, marks, marksP^, True, False);

      { Convert so markup string is direct return; NEDIT_DIST and EDIT_DIST are returned in }
      { THEMARKUPRETURNVALUES. }
      extToStr(paramPtr, ned, s1);
      numToStr(paramPtr, edit_dist, s2);
      s1 := concat(s1, c, s2);
      paramPtr^.returnValue := pasToZero(paramPtr, marks);
      setGlobal(paramPtr, 'theMarkUpReturnValues', pasToZero(paramPtr, s1));
    end;
end; { edit_trace }
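The recurrence at the heart of nedit_dist (match or substitution, extra letter, missing letter, transposition) and its normalization can be sketched in Python as follows. This is a simplified illustration, not the XFCN's algorithm in full: substitution uses a flat wchange instead of the phonetic matrix, case and accent scoring and markup generation are omitted, the function names are invented, and the defaults are the listing's default weights with 36 standing for TRUNC(1.2 * 30).

```python
INF = 10**9  # stands in for the listing's "infinity"


def edit_distance(r, m, winsert=20, wdelete=20, wchange=30, wtranspose=20):
    """Weighted edit distance between response word r and model word m."""
    rl, ml = len(r), len(m)
    # d[a + 1][b + 1] plays the role of editd[a, b]; row and column 0 are the -1 border.
    d = [[INF] * (ml + 2) for _ in range(rl + 2)]
    d[1][1] = 0
    for i in range(1, rl + 1):
        d[i + 1][1] = d[i][1] + winsert      # response letters with no model letters
    for j in range(1, ml + 1):
        d[1][j + 1] = d[1][j] + wdelete      # model letters with no response letters
    last_seen = {}                           # dr: last response index of each char
    for i in range(1, rl + 1):
        db = 0                               # model position of last matched char
        for j in range(1, ml + 1):
            rc, mc = r[i - 1], m[j - 1]
            i1, j1 = last_seen.get(mc, 0), db
            if rc == mc:
                cost, db = 0, j
            else:
                cost = wchange
            d[i + 1][j + 1] = min(
                d[i][j] + cost,              # x : match or substitution
                d[i][j + 1] + winsert,       # x2: extra letter in response
                d[i + 1][j] + wdelete,       # x3: letter missing from response
                d[i1][j1] + (i - i1 - 1) * wdelete + wtranspose
                + (j - j1 - 1) * winsert,    # x4: transposition
            )
        last_seen[r[i - 1]] = i
    return d[rl + 1][ml + 1]


def normalized_distance(r, m, max_edit_weight=36, wdelete=20):
    """Divide by the maximum cost of converting one word length to the other."""
    lo, hi = sorted((len(r), len(m)))
    if hi == 0:
        return 0.0
    return edit_distance(r, m, wdelete=wdelete) / (max_edit_weight * lo + wdelete * (hi - lo))
```

With the default weights a transposed pair of letters costs 20, less than the 30 of a single substitution, which is the ordering the setWeights comments ask for.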

< ) 

{ =========================== WORD ORDER ANALYSIS =========================== }

{ Get least edit distance between each pair of words (M, R) where M is taken from }
{ the correct answer and R comes from the student's response. The procedure }
{ takes synonyms into account, so that the synonym M with minimal edit distance }
{ from a given R is used to determine edit distance. }
{ The normalized edit distance for each pair is computed. If this distance is }
{ greater than a certain criterion, prop_errors, then the two words are considered }
{ to be different and an "infinite" distance assigned. If the distance is less }
{ than the criterion, then the actual distance is assigned, scaled to make it an }
{ integer between 1 and editWeightScale. }

procedure fill_editd_matrix (var mw, rw: inputwsvector; var mwseq: inputwivector; var sim: wimatrix; var simseq: wimatrix);
var
  m, mp, r, p: INTEGER;
  d: LONGINT;
  ned: REAL;
  spellmarks: wordstr;
begin
  { Create a list (set) of response words which exactly match ignorable model words. }
  rignore := [];
  for m := 1 to mwk do
    if m in mignore then
      for r := 1 to rlg do
        begin
          sim[r, mwseq[m]] := infinity;
          if mw[m] = rw[r] then
            rignore := rignore + [r]
        end;

  { Look at each response word in turn. }
  for r := 1 to rlg do
    { If the response word is an ignorable word, it cannot match any model position. }
    if r in rignore then
      for mp := 1 to mwk do   { Exclude all model positions. }
        sim[r, mp] := infinity
    else
      for m := 1 to mwk do
        begin
          { Using m as an index for POSITION, initialize distance between each response }
          { word and model position to the default of "infinity". Since mlg will always }
          { exceed the number of positions, this will initialize all positions, and some }
          { cells beyond that. }
          sim[r, m] := infinity;
          simseq[r, m] := -1;

          { Get the edit distance between the current response and model words. The }
          { spellmarks are not actually returned here, but a parameter is required at that }
          { position. }
          ned := nedit_dist(rw[r], mw[m], spellmarks, marksP^, False, shortcut);

          { If the normalized edit distance exceeds a criterion value, then round }
          { it to infinity; otherwise, convert it to an integer between 1 and editWeightScale. Shortcut }
          { may be used: words differing too much in length are assigned infinite distance. }
          if ned <= prop_errors then
            d := Trunc(editWeightScale * ned)
          else
            d := infinity;

          { Get the position in the model which the current model word corresponds to. }
          { (Several synonyms may share the same position, but note that p will always be }
          { less than m, hence a cell referenced by p will already have been initialized.) }
          p := mwseq[m];

          { If response word is like ignorable model except for cap and accent errors, then }
          { ignore it and make it unmatchable with every model position. MP is model position index. }
          if (m in mignore) then
            if (edit_dist < winsert) then
              begin
                rignore := rignore + [r];
                for mp := 1 to mwk do
                  sim[r, mp] := infinity;
              end
            else
          { If R isn't close to ignorable M, leave distance infinite. }
          { Otherwise, if the current distance is smaller than what is already entered in this cell, }
          { then replace the cell contents with the new distance, and keep track of the model }
          { word number used to fill this model position. This means that when there are several }
          { synonyms occupying a single position in the pattern, the model word ultimately }
          { used will always be the one which is closest to the response word, and the distance }
          { entered in the cell will always be the minimum possible for the given synonym }
          { list. This is essential for proper handling of synonyms. The actual model word }
          { matched must be remembered so that an appropriate spelling markup can be }
          { generated later. }
          else if d < sim[r, p] then
            begin
              sim[r, p] := d;
              simseq[r, p] := m;
            end;
        end; { FOR m := 1 TO mwk; ELSE; FOR r := 1 TO rlg }

  { The matrix has the dimensions rlg X plg. Initializations of other cells are bogus. }
  if trace then
    see_nedit_matrix('End of FILL_EDITD_MATRIX');
end; { fill_editd_matrix }

{ ------------------------------------------------------ List_possible_matches }

{ Generate a list of possible matches, in set format, for each response word }
{ position. }

procedure list_possible_matches;
var
  r, m: INTEGER;
begin
  rmatched := [];
  mmatched := [];
  rmaybe := [];
  mmaybe := [];
  for r := 1 to rlg do
    begin
      choices[r] := [];
      runtogether[r] := -1;
      for m := 1 to plg do
        if a[r, m] < infinity then
          begin
            choices[r] := choices[r] + [m];
            rmaybe := rmaybe + [r];
            mmaybe := mmaybe + [m];
            if a[r, m] = 0 then
              begin
                mmatched := mmatched + [m];
                rmatched := rmatched + [r];
              end;
          end;
    end;
  if trace then
    begin
      showset('rmatched', rmatched);
      showset('mmatched', mmatched);
      showset('rignore', rignore);
      showset('mignore', mignore);
      for r := 1 to rlg do
        showset(concat('choices R= ', NtoS(r)), choices[r]);
    end;
end; { list_possible_matches }



{ ----------------------------------------------------------- Find_runtogether }

{ This procedure is called after the initial pass at spell matching has been done. }
{ For each unmatched word in r, all adjacent UNMATCHED pairs of positions, m, mm, }
{ in the model are examined in turn to see if r matches the concatenation of m }
{ with mm (possibly allowing for some misspelling). This means that run-together }
{ words are identified as such only if they appear in the exact order specified }
{ by the model. When forming the pairs m, mm, the following complications }
{ must be taken into account: }

{ 1. Ignorable words in the model must be left out of consideration when }
{ determining adjacency, so that m and mm will be considered adjacent if }
{ separated by nothing but ignorable words. E.g., in a < b c > d, a and d are }
{ adjacent. }

{ 2. Synonym lists must receive special treatment. Suppose two adjacent }
{ synonym lists [ a b ] [ d e f ], with each of them unmatched (i.e., no r }
{ has matched any synonym at either position). Then r must be compared to every }
{ member of the cartesian product of the two lists: ad, ae, af, bd, be, bf. }
{ Likewise, for a [ b c d ], r must be compared to ab, ac, and ad. }

procedure find_runtogether;
var
  r, m, mm, xm, xmm, p, pp: INTEGER;
  unmatchedm, unmatchedr: wordset;
label
  1;

  { ------------------------------------------------------------ Next_position }

  { Search through list of model words and return index of first word right of 'start' }
  { which (a) is not an ignorable word, and (b) is not a synonym of the word at }
  { 'start'. If the word found is part of a synonym list, it will always be the first }
  { member of that list. If no word of this sort can be found beyond 'start', return -1. }

  function next_position (startingmw: INTEGER): INTEGER;
  var
    i, lastp: INTEGER;
  begin
    next_position := -1;
    if startingmw = -1 then
      EXIT(next_position);
    lastp := mwseq[startingmw];
    for i := Succ(startingmw) to mwk do
      if (mwseq[i] <> lastp) and (mwseq[i] in unmatchedm) then
        begin
          next_position := i;
          EXIT(next_position);
        end;
  end; { next_position }

  { ---------------------------------------------------------- Try_to_split_rw }

  { Compare response word r to the run-together string consisting of model words }
  { m and mm. If spelling analysis yields an edit distance of less than split_criterion, }
  { then (a) match r with the word at m; (b) mark the "run on" portion of r which }
  { corresponds to the word at mm as ignorable, so that it will not be marked up }
  { as a missing word; (c) remove both p and pp from the list of unmatched positions }
  { so that they will not be matched to some other r; (d) add p to the list of choices }
  { available for assigning to r during the order analysis. No markup and no short- }
  { cut used when computing edit distance. }

  function try_to_split_rw: BOOLEAN;
  var
    d: REAL;
    dummy, runtogetherword: wordstr;
  begin
    try_to_split_rw := False;
    runtogetherword := concat(mw[xm], space, mw[xmm]);
    d := nedit_dist(rw[r], runtogetherword, dummy, marksP^, False, False);

    { The comparison operator on edit_dist and the tail of the a[r, p] assignment are }
    { illegible in the scanned listing; "<=" and "* d" are assumed here. }
    if runtogetherflag and ((edit_dist <= wdelete) or (d <= runon_criterion)) then
      begin
        try_to_split_rw := True;
        a[r, p] := Trunc(editWeightScale * d);
        aseq[r, p] := xm;
        runtogether[r] := xmm;
        choices[r] := choices[r] + [p];
        mignore := mignore + [pp];
        unmatchedm := unmatchedm - [p, pp];
        unmatchedr := unmatchedr - [r];
      end;
  end; { try_to_split_rw }

begin { find_runtogether }
  { Get set of unmatched m positions (not matched and not ignorable). Also set of }
  { unmatched r words. }
  unmatchedm := [1..plg] - mmaybe - mignore;
  unmatchedr := [1..rlg] - rmaybe - rignore;

  if trace then
    begin
      showset('unmatchedm :', unmatchedm);
      showset('unmatchedr :', unmatchedr);
    end;

  for r := 1 to rlg do
    begin
      runtogether[r] := 0;
      if r in unmatchedr then
        begin
          { Get pairs of adjacent model positions, p, pp. In determining adjacency, }
          { ignorable words are neglected, and a synonym list counts as one position. }
          { If the next position is a single word, then 'next_position' returns the index }
          { number of that word; if it is a synonym list, then it returns the index number }
          { of the first word in that list. When there are no further such pairs, either }
          { m or mm will be returned as -1. }
          m := next_position(0);
          mm := next_position(m);
          if mm = -1 then
            EXIT(find_runtogether);
          while mm <> -1 do
            begin
              { Get the position numbers of the words. }
              p := mwseq[m];
              pp := mwseq[mm];

              { Verify that both positions are still unmatched. If not, they cannot be }
              { matched against the possibly run-together r, and we must move on to the next }
              { pair of m, mm. }
              if [p, pp] <= unmatchedm then
                { This double loop takes care of cases where either m or mm, or both, head a }
                { synonym list. In this case, r must be tested against all combinations of words }
                { w, ww, where w is drawn from the synonym list headed by m, and ww is drawn }
                { from the list headed by mm. In the common case where neither m nor mm heads a }
                { synonym list, each loop executes once only. The loops operate by starting at }
                { m (mm) and advancing rightward one word at a time to the end of the synonym list, }
                { signaled when the position number associated with the current word changes. }
                begin
                  xm := m;
                  while p = mwseq[xm] do
                    begin
                      xmm := mm;
                      while pp = mwseq[xmm] do
                        begin
                          { Here rw[ r ] is tested for a match with the run-together string mw[ xm ] + mw[ xmm ]. }
                          { If the match succeeds, then exit the m, mm loops and start work on the next r. }
                          if try_to_split_rw then
                            goto 1;
                          Inc(xmm);
                        end;
                      Inc(xm);
                    end;
                end; { IF [ p, pp ] <= unmatchedm }

              { Move right to next pair of adjacent model positions. }
              m := next_position(m);
              mm := next_position(m);
            end; { WHILE mm <> -1 }
        end; { IF r IN unmatchedr }
1:
    end; { FOR r := 1 TO rlg DO }

  { Update list of matched response words and model positions. }
  mmatched := [1..plg] - mignore - unmatchedm;
  rmatched := [1..rlg] - rignore - unmatchedr;
end; { find_runtogether }
{ ---------------------------------------- Search_sequences }
{ Core procedure of the order analysis; actually produces an optimal matching of }
{ model and response words. Optimal means that the match returned is at least as }
{ good as any other match which could be generated. If A and B are two matchings, }
{ A is defined to be better than B if (i) A has fewer inversions than B, or, if }
{ A and B have the same number of inversions, (ii) the final inversion is as }
{ far to the right as possible. Consider this example, where a, b, c ... symbolize }
{ full words: }
{ }
{   Model word:              a  b  c  a  b  c }
{   Position number:         1  2  3  4  5  6 }
{ }
{ For each response word, several model words, at different positions in the }
{ model, may match: }
{ }
{   Response word:           b  c  a  b  c  a }
{   Response position:       1  2  3  4  5  6 }
{   Matching model words:    2  3  4  2  3  4 }
{                            5  6  1  5  6  1 }
{ }
{ A matching is generated by choosing one of the numbers (i.e., model words) }
{ in each column, subject to the restriction that no number be chosen twice. }
{ Possible matchings are indicated in the table below. Of course, a model and }
{ response word can only be matched if they are sufficiently similar (ideally, }
{ identical). }
{ }
{   Response word position:  1  2  3  4  5  6 }
{   One possible matching:   2  3  4  5  6  1 }
{   A second matching:       5  6  1  2  3  4 }
{   A third matching:        2  6  1  5  3  4 }
{ }
{ An inversion occurs whenever two successive numbers invert their natural order; e.g., }
{ the first matching has one inversion, at 6-1. The third has two: at }
{ 6-1 and at 5-3. }
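The inversion count that ranks the matchings above can be checked with a short sketch. It is written in Python rather than the report's Pascal, the function name `count_inversions` is hypothetical, and skipping the dummy word 0 is an assumption that mirrors how the listing treats extra words.

```python
def count_inversions(seq):
    """Return (number of inversions, 1-based position of the first inversion or 0).

    seq lists the model position chosen for each response word; 0 stands for
    the dummy model word assigned to an unmatched (extra) response word.
    """
    inversions = 0
    first = 0
    last = 0  # model position of the most recent real match
    for pos, m in enumerate(seq, start=1):
        if m == 0:
            continue  # extra words never count toward inversions
        if m < last:
            inversions += 1
            if first == 0:
                first = pos
        last = m
    return inversions, first


# The three matchings from the table above.
print(count_inversions([2, 3, 4, 5, 6, 1]))  # one inversion, at 6-1
print(count_inversions([5, 6, 1, 2, 3, 4]))  # one inversion, at 6-1
print(count_inversions([2, 6, 1, 5, 3, 4]))  # two inversions, at 6-1 and 5-3
```

The first two matchings tie on inversion count, which is why the listing needs a secondary criterion based on where the inversion falls.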

{ The algorithm generates all possible matchings in a depth-first manner, }
{ moving forward in the sequence of response-word positions by recursive }
{ descent, counting the number of inversions along the path as it goes, and }
{ keeping track of the position of the rightmost inversion. When a matching is }
{ complete, the algorithm checks to see if it is at least as good as the matches }
{ generated so far and, if so, saves it in a list of solutions. }
{ }
{ In the worst case, where there were N response words and N model words, all }
{ identical, there would be N candidate words to fill each response position, }
{ and hence N! paths to check, leading to a near-exponential algorithm. In fact, }
{ the algorithm turns out to be fairly efficient, for several reasons: }
{ }
{ 1. In actuality, even for fairly pathological cases such as cyclic and near- }
{ cyclic patterns, there are relatively few choices for matching at each response }
{ position. }
{ }
{ 2. The algorithm does extensive tree pruning. As soon as it becomes clear that }
{ a path cannot be optimal (because the number of inversions has exceeded the }
{ minimum so far found), work is immediately terminated on that path and all }
{ subpaths. This drastically reduces the search space. In practice, it appears }
{ that search time is roughly quadratic. }
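The depth-first search with pruning described above might be sketched as follows. This is an illustrative Python reduction, not the report's Pascal: the name `search_sequences` is reused only for readability, `choices` and `n_model` are hypothetical inputs, and a single best solution is kept instead of a solution stack. The acceptance test follows the listing's rule (fewer inversions, or equal inversions with the first inversion further right).

```python
def search_sequences(choices, n_model):
    """choices[p] is the set of model positions (1-based) that response
    word p may match. Returns (sequence, inversion count)."""
    best = {"inv": float("inf"), "firstinv": 0, "seq": None}

    def better(inv, firstinv):
        return inv < best["inv"] or (inv == best["inv"] and firstinv > best["firstinv"])

    def recurse(remaining, seq, inv, firstinv, lastchosen, p):
        if p == len(choices):  # termination: every response word handled
            if better(inv, firstinv):
                best.update(inv=inv, firstinv=firstinv, seq=seq)
            return
        available = sorted(remaining & choices[p])
        if not available:  # extra word: assign dummy model word 0
            recurse(remaining, seq + [0], inv, firstinv, lastchosen, p + 1)
            return
        for chosen in available:  # leftmost candidate first
            new_inv, new_first = inv, firstinv
            if chosen < lastchosen:
                new_inv += 1
                if new_first == 0:
                    new_first = p + 1  # 1-based response position
            if better(new_inv, new_first):  # otherwise prune this subtree
                recurse(remaining - {chosen}, seq + [chosen],
                        new_inv, new_first, chosen, p + 1)

    recurse(set(range(1, n_model + 1)), [], 0, 0, 0, 0)
    return best["seq"], best["inv"]


# Model "a b c a b c", response "b c a b c a" (the example table above).
example = [{2, 5}, {3, 6}, {1, 4}, {2, 5}, {3, 6}, {1, 4}]
print(search_sequences(example, 6))
```

On this cyclic example the search settles on the first matching in the table, and most alternative branches are cut off as soon as their first inversion appears.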



{ Input parameters: }
{ }
{ remaining_choices  A list (in set format) of all the model words which have }
{                    not yet been matched and thus are still available for }
{                    matching. }
{ }
{ solution           A record containing a description of this matching as so }
{                    far developed, including sequence of model word numbers, }
{                    inversion count, and position of rightmost inversion. }
{ }
{ lastchosen         Position of last model word chosen. When no model word }
{                    matches a response, an arbitrary value of '0' is assigned }
{                    to the solution sequence, but 'lastchosen' retains its }
{                    prior value in this case, so that the inversion count will }
{                    not be made against "extra" words. }
{ }
{ p                  Response word position at which matching should take place. }
{ }
{ (Global) data structures affected: }
{ }
{ solutionlist       A list of optimal solutions; each solution is a record }
{                    containing the actual matching (the sequence into which }
{                    the model words must be rearranged to match the response), }
{                    the number of inversions, and the position of the rightmost inversion. }
{ }
{ solutionk          Number of records on the solution list (starts with 1). }

procedure search_sequences (remaining_choices: wordset; solution: solutionrec; lastchosen: BYTE; p: INTEGER);
var
  chosen: wrange;
  xsol: solutionrec;
  available_choices: wordset;
  i: INTEGER;

{ ---------------------------------------- Choose_next }
{ Chooses the first (leftmost) of a list of words whose positions are represented }
{ in set format. }
  function choose_next (want: wordset): wrange;
  var
    i: INTEGER;
  begin
    i := 0;
    repeat
      Inc(i)
    until (i in want);
    choose_next := i
  end; { choose_next }

{ ---------------------------------------- Save_solution }
{ Pushes a solution onto the solution list (a stack). }
  procedure save_solution (sol: solutionrec);
  begin
    if solutionk < wmax then
      Inc(solutionk);
    solutionlist[solutionk] := sol
  end;

{ ---------------------------------------- Trace_solution }
  procedure trace_solution;
  var
    i: INTEGER;
  begin
    {Write(' ' : 2 * p);}
    {for i := 1 to length(solution.seq) do}
    {Write(Ord(solution.seq[i]) : 2);}
    {Write(' inv#=', solution.inversionk : 2, ' firstinv=', solution.firstinv : 2, ' p=', p : 2);}
    {Writeln}
  end; { trace_solution }

begin { search_sequences }
  if trace then
  begin
    trace_solution;
    pause;
  end;

  { This is the termination clause. If we have run out of response words to }
  { match, this path is complete. Check the cost (number of inversions) in this }
  { matching, and if it is more efficient than those currently on the stack, }
  { then clear the stack and start it over with this solution; otherwise, }
  { simply add the solution to those already present and exit, returning to earlier }
  { recursions. }
  Inc(recursionk);
  if (p > rlg) then
  begin
    if (solution.inversionk < mincost) or ((solution.inversionk = mincost) and (solution.firstinv > rightmost)) then
    begin
      mincost := solution.inversionk;
      rightmost := solution.firstinv;
      solutionk := 0;
      save_solution(solution);
    end;
    EXIT(search_sequences);
  end;

  { Otherwise, we are still generating a path by recursion. Make a working copy }
  { of the solution, so that the partial solution state will be preserved upon }
  { return. Find out what matching choices are available for this response word }
  { by restricting the choices at this position to those not used at previous }
  { positions. If no choices are available, match this response word to the dummy }
  { model word # 0 (this means treating the response word at this position as an }
  { extra word), and recurse to match the next position rightward. }
  xsol := solution;
  available_choices := (remaining_choices * choices[p]);
  if available_choices = [] then
  begin
    xsol.seq := concat(solution.seq, Chr(0));
    search_sequences(remaining_choices, xsol, lastchosen, p + 1);
  end
  else

  { If one or more words are available for match, choose each of them in turn }
  { (this is done iteratively using the WHILE loop). Use this choice to extend the }
  { matching, updating the word sequence, the number of inversions, and the position }
  { of the rightmost inversion. Notice that 'available_choices', which records the }
  { choices not yet tried at this loop, will shrink each time through the loop, while }
  { 'remaining_choices', which records words not yet entered into the match, will }
  { stay unchanged. }
    while available_choices <> [] do
    begin
      xsol := solution;
      chosen := choose_next(available_choices);
      available_choices := available_choices - [chosen];
      xsol.seq := concat(solution.seq, Chr(chosen));
      if chosen < lastchosen then
      begin
        Inc(xsol.inversionk);
        if xsol.firstinv = 0 then
          xsol.firstinv := p;
      end;

      { If the matching as so far developed is as good or better than any solution }
      { so far found, then continue this matching by recursive descent to the next }
      { response word position. Otherwise, abandon this matching, and do not try to }
      { extend any matches from this point in the search tree (tree pruning). Simply }
      { continue the loop, choosing another matching possibility at this position, }
      { if any remain. Note that before it is passed to the next level, the set of }
      { 'remaining_choices' must have the current choice removed. }
      { N.B. - Still greater pruning efficiency could be obtained by retaining a }
      { solution only as long as it remained strictly better than the current solutions }
      { (inversionk < mincost). This is not done here so that the secondary criterion }
      { of rightmost final inversion's position can be applied. }
      if (xsol.inversionk < mincost) or ((xsol.inversionk = mincost) and (xsol.firstinv > rightmost)) then
        search_sequences(remaining_choices - [chosen], xsol, chosen, p + 1);
    end; { WHILE }

  { If all the possibilities at this position have been used, return to the }
  { previous level of recursion and work on further possibilities there, generating }
  { new branches in the search tree. }
end; { search_sequences }



{ ---------------------------------------- Adjust_solution }
{ The strictly left-to-right recursion of the matching algorithm unfortunately }
{ leads to situations like this one: }
{ }
{   Model:                  the time }
{   Response:               time then the }
{   Markup generated:       * < x XXX }
{   More intuitive markup:  * XXXX < }
{ }
{ This comes about because the matching algorithm does not consider variations in }
{ edit distance when matching words; it only knows that a pair is or is not a }
{ permissible match. Hence a misspelled word is just as good a candidate as }
{ a perfect match. When several response words match a given model word, the }
{ leftmost is always selected in preference to the 'redundant' rightward versions, }
{ even if the rightward versions are better spelled, and hence intuitively better }
{ matches. E.g., in the response above, 'then' is always selected to match 'the' }
{ in the model, leading to a counter-intuitive markup. }
{ }
{ This procedure scans the solution, looking for cases where a rightward word }
{ would constitute a better fit than the current assignment, and adjusts the }
{ solution accordingly, producing, e.g., the improved, 'intuitive' markup shown }
{ above. Inversion count may be affected in cases like this: }
{ }
{   Model:              the time }
{   Response:           then time }
{   Markup:             x XXX }
{   'Improved' markup:  XXXX* }
{ }
{ and it is not clear that this really represents an improvement. Hence, the }
{ inversion count of each adjusted solution is checked, and if the number of }
{ inversions is increased, the proposed adjustment is not accepted. }
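The exchange rule described above can be sketched in Python. The sketch follows the acceptance condition in the listing (an exchange is taken when it lowers total edit distance without adding inversions, or keeps distance equal while removing inversions). Everything named here is hypothetical: `adjust_solution` is a simplified stand-in that ignores ignorable words, `dist` is an invented distance table, and `INFINITY` is assumed to be a large finite constant, since the rule only works if "infinity plus a small distance" still compares as larger than "infinity".

```python
INFINITY = 10**6  # assumed: a large finite stand-in for the listing's 'infinity'


def inversions(seq):
    """Count order inversions, skipping dummy model word 0."""
    count, last = 0, 0
    for m in seq:
        if m != 0:
            if m < last:
                count += 1
            last = m
    return count


def adjust_solution(seq, dist):
    """seq[r] is the model position matched to response word r (0 = unmatched);
    dist[r][m] is the edit distance between response word r and model word m."""
    s = list(seq)
    for r in range(len(s)):
        for ri in range(len(s)):
            m, mi = s[r], s[ri]
            d11 = dist[r][m] if m else INFINITY
            d21 = dist[ri][m] if m else INFINITY
            d12 = dist[r][mi] if mi else INFINITY
            d22 = dist[ri][mi] if mi else INFINITY
            ed1, ed2 = d11 + d22, d12 + d21  # current vs. exchanged cost
            swapped = list(s)
            swapped[r], swapped[ri] = s[ri], s[r]
            inv_change = inversions(swapped) - inversions(s)
            if (ed2 < ed1 and inv_change <= 0) or (ed2 == ed1 and inv_change < 0):
                s = swapped
    return s


# Model "the time" (positions 1, 2); response "time then the".
# Invented distances: 'then' is one edit from 'the', unrelated words are 3 apart.
dist = [{1: 3, 2: 0}, {1: 1, 2: 3}, {1: 0, 2: 3}]
print(adjust_solution([2, 1, 0], dist))  # 'the' takes over model word 1
```

In the example, 'then' gives up its match to the correctly spelled 'the' and becomes the extra word, which corresponds to the "more intuitive markup" case.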



procedure adjust_solution (var sol: solutionrec);
var
  r, ri, m, mi, ulim, llim, invComp: INTEGER;
  d11, d12, d21, d22, ed1, ed2: LONGINT;
  rsave: CHAR;
  s: str80;
  strg1, strg2, strg: str255;
  validwords: wordset;

{ ---------------------------------------- CompInvCount }
{ Compares the number of inversions in an old solution and a new solution, and }
{ returns -1 if the new solution has strictly fewer inversions than the old one, }
{ 0 if the number of inversions is the same, or 1 if the new solution has more. }
{ The old solution is specified in the input parameters by a ptr into the }
{ sequence field of a solution record, typecast into a byte array. The proposed }
{ new solution is specified as a possible inversion, where the M belonging to r }
{ is to be exchanged with the M belonging to ri. }
  function compInvCount (s: str80; r, ri: INTEGER): INTEGER;
  var
    xs: str80;
    i, k, xk, c, lasti, lastxi: INTEGER;
  begin
    { Build a sequence list for the proposed new solution. }
    xs := s;
    xs[r] := s[ri];
    xs[ri] := s[r];

    { Initialize inversion counters and bookkeeping for counting loop. }
    k := 0;
    xk := 0;
    lasti := 0;
    lastxi := 0;

    { Count inversions in both the old and the proposed new solution in parallel. }
    for i := 1 to rlg do
    begin
      c := Ord(xs[i]);
      if (c <> 0) then
      begin
        if (c < lastxi) then
          Inc(xk);
        lastxi := c;
      end;

      c := Ord(s[i]);
      if (c <> 0) then
      begin
        if (c < lasti) then
          Inc(k);
        lasti := c;
      end;
    end;

    { Return a code which tells whether the proposed solution is at least as good }
    { in terms of number of inversions. }
    if (xk < k) then
      compInvCount := -1 {new sol has fewer inversions}
    else if (xk = k) then
      compInvCount := 0 {new & old have same number of inversions}
    else
      compInvCount := 1; {old sol has fewer inversions}
  end; { compInvCount }



begin { adjust_solution }
  { Build a list, in set format, of all valid (non-ignorable) response positions. }
  s := sol.seq;
  validwords := [1..rlg] - rignore;

  { Look in turn at each valid response word R, and the model word it is matched }
  { to, M. Look for another valid response word, say R', matched to M', such that exchanging the }
  { match, so that R is matched to M' and R' is matched to M, would improve the overall solution, }
  { because it either (a) causes no more inversions and improves total edit distance for the sentence, OR }
  { (b) causes fewer inversions, and does not increase total edit distance. Whenever such R, R' can be found, }
  { then exchange the match so R goes with M' and R' goes with M. }
  for r := 1 to rlg do
    if (r in validwords) then
      for ri := 1 to rlg do
        if (ri in validwords) then
        begin
          m := Ord(s[r]);
          mi := Ord(s[ri]);
          if m = 0 then
          begin
            d11 := infinity;
            d21 := infinity;
          end
          else
          begin
            d11 := a[r, m];
            d21 := a[ri, m];
          end;

          if mi = 0 then
          begin
            d12 := infinity;
            d22 := infinity;
          end
          else
          begin
            d12 := a[r, mi];
            d22 := a[ri, mi];
          end;

          ed1 := d11 + d22;
          ed2 := d12 + d21;
          invComp := compInvCount(s, r, ri);

          if ((ed2 < ed1) & (invComp <= 0)) | ((ed2 = ed1) & (invComp < 0)) then
          begin { Exchange assignments so R <-> M' and R' <-> M. }
            rsave := s[ri];
            s[ri] := s[r];
            s[r] := rsave;
          end
        end;
  sol.seq := s;
end; { adjust_solution }

{ ---------------------------------------- Find_best_order }

{ Has overall control of the order analysis. Accepts a matrix of (normalized) }
{ edit distances as input, and returns a single optimal matching as a solution, }
{ in the form of a mapping of model to response words. The mapping is represented }
{ in the vectors r_to_m and (with index variable inverted) m_to_r. }
procedure find_best_order;
var
  i: INTEGER;
  totDist: LONGINT;
  r, m, p: 0..wmax;
  sol: solutionrec;
  rightmost: INTEGER;
  s: str80;
begin
  { Initialize variables used by order matching algorithm, and call the algorithm }
  { at top level of recursion. Set of positions initially available consists of all }
  { positions in the model. }
  sol.seq := '';
  sol.inversionk := 0;
  sol.firstinv := 0;
  mincost := infinity;
  rightmost := 0;
  solutionk := 0;
  solutions_tried := 0;
  recursionk := 0;
  search_sequences([1..plg], sol, 0, 1);

  { Adjust solution to improve match with respect to spelling accuracy. }
  if adjust_needed then
    adjust_solution(solutionlist[1]);

  { Build a mapping of response-to-model and model-to-response words. These }
  { will be used to generate the sentence markup. Unmatched words are assigned to }
  { dummy word #0. }
  s := solutionlist[1].seq;
  matchedk := 0;
  avedist := 0.0;
  totDist := 0;
  mmatched := [];
  rmatched := [];
  for i := 1 to wmax do
  begin
    m_to_r[i] := 0;
    r_to_m[i] := 0
  end;

  for r := 1 to rlg do
  begin
    m := Ord(s[r]);
    r_to_m[r] := m;
    if m > 0 then
    begin
      rmatched := rmatched + [r];
      mmatched := mmatched + [m];
      m_to_r[m] := r;
      totDist := totDist + a[r, m]; { total scaled normalized edit dist }
      Inc(matchedk);
    end;
  end;

  if matchedk > 0 then
    avedist := totDist / (LONGINT(editWeightScale) * LONGINT(matchedk));

  { Compute proportion of matched words and proportion of non-inversions. These }
  { values are used to decide whether a response should be judged as a match (ok) }
  { or a non-match (no) to the model pattern. }
  i := card([1..rlg] - rignore) + card([1..plg] - mignore);
  if i > 0 then
    pmatched := (2 * matchedk) / LONGINT(i)
  else
    pmatched := 1;
  if matchedk > 0 then
    pnoninversions := 1 - (solutionlist[1].inversionk / matchedk)
  else
    pnoninversions := 1;
end; { find_best_order }
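The three fit measures computed at the end of find_best_order reduce to simple ratios. The following Python sketch restates them under stated assumptions: `fit_measures` and its parameter names are hypothetical, the word counts are taken to exclude ignorable words already, and `scale` plays the role of editWeightScale.

```python
def fit_measures(seq, n_response, n_model, inversion_count, total_dist, scale):
    """Return (pmatched, pnoninversions, avedist) for one matching.

    seq holds the model position matched to each response word (0 = unmatched).
    n_response and n_model count the non-ignorable words on each side.
    """
    matched = sum(1 for m in seq if m > 0)
    denom = n_response + n_model
    # Each matched pair accounts for one response word and one model word.
    pmatched = (2 * matched) / denom if denom > 0 else 1.0
    pnoninversions = 1 - inversion_count / matched if matched > 0 else 1.0
    avedist = total_dist / (scale * matched) if matched > 0 else 0.0
    return pmatched, pnoninversions, avedist


# Response "time then the" against model "the time": two of three response
# words matched, one inversion, no spelling distance on the matched pairs.
print(fit_measures([2, 0, 1], 3, 2, 1, 0, 100))
```

A response that matches the model exactly yields 1.0, 1.0 and 0.0, which are the values the acceptance test in compare looks for.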

{ ---------------------------------------- Word_markup }
{ Generate proper markup for each word in sentence. Order markup symbols }
{ (missing word, extra word, displaced word) are generated here. If a }
{ misspelled word is also out of order, a (non-blank) order mark preempts any spelling }
{ mark which might be on the first character of the word. }
{ (Global) Return vars: }
{ wordmark : vector of response word markups, spelling and order markup combined. }
procedure word_markup;
var
  r, m, p, lastfound: INTEGER;
  c: string[1];
  d: REAL;
  s, mword: wordstr;
  notmissing: wordset;
begin
  { Get the set of model positions which will NOT cause a missing word markup }
  { if absent at the expected place in the response. If anyorderok is false, }
  { this will consist of ignorable words. If anyorderok is true, it will be }
  { ignorable words plus those which appear somewhere in the sentence (i.e., are }
  { actually matched), but are out of order. }
  if anyorderok then
    notmissing := mignore + mmatched
  else
    notmissing := mignore;
  lastfound := 0;
  for r := 1 to rlg do
  { For each word in the response, retrieve the matched position, m, in the model, }
  { and the corresponding word, p (if there were synonym lists in the model, then }
  { probably m <> p). }
  begin
    m := r_to_m[r];
    p := aseq[r, m];

    { If nothing matched to response word, and it is not an ignorable word, }
    { and extra words are not permitted, then generate extra word markup. }
    if r in rignore then
      wordmark[r] := ''
    else if m = 0 then
      if extrawordsok then
        wordmark[r] := ''
      else
        wordmark[r] := dup_char(extrawd, Length(rw[r]))
    { Otherwise, generate spelling markup. }
    else
    begin
      { If current word is a runtogether, restore space before generating spellmarks. }
      if runtogether[r] > 0 then
        mword := concat(mw[p], space, mw[runtogether[r]])
      else
        mword := mw[p];

      { If misspelling is OK, make spelling markup blank, else generate it. }
      if misspellOK then
        s := dup_char(nomark, length(mword))
      else { Return full markup in S. }
        d := nedit_dist(rw[r], mword, s, marksP^, True, False);

      { If the model position just matched skips ahead (right) of the last model }
      { position by more than 1, then (unless every model word in between the two }
      { positions is an ignorable word or unmarkable because it is merely out of order }
      { and order is not being marked) some model words were left out at this point }
      { in the response, so generate a missing word markup to go just before this }
      { response word. }
      if (m > lastfound + 1) and not ([lastfound + 1..m - 1] <= notmissing) then
        c := missingwd
      else
        c := '';

      { If the model position matched appears in the model at a position to the left }
      { of the last model position matched, then the response ordering inverts the model }
      { ordering at this point. Mark the matched response word as needing to be moved }
      { leftward. Do not do this, however, if anyorderok is in effect. Preempt the }
      { first character of the spelling markup to show the moveword symbol, unless it's a space. }
      if (m < lastfound) then
        if (not anyorderok) & (movewd <> nomark) then
          wordmark[r] := concat(c, movewd, Copy(s, 2, Length(s) - 1))
        else
          wordmark[r] := concat(c, s)
      else
      begin
        wordmark[r] := concat(c, s);
        lastfound := m
      end;
    end;
  end; { FOR }

  { If final model position matched was not the rightmost position of model, }
  { then some model words were left out at the end of the response; mark missing }
  { words at end of response. }
  if not ([lastfound + 1..plg] <= mignore) then
    wordmark[rlg + 1] := missingwd
  else
    wordmark[rlg + 1] := '';
end; { word_markup }



{ ---------------------------------------- Mark_sentence }
{ Prepare a markup string which can be displayed beneath words of student's response. }
{ Task of this routine is to make sure each markup will be positioned beneath appropriate }
{ word and letters. }
{ Input vars: }
{ wordmark (Global) vector of markup strings, one for each response word }
function mark_sentence: string;
var
  r, p: INTEGER;
  m, outstr: string;
begin
  { First char of markup should plot in position preceding first char of response }
  { to provide place for a leading caret when initial words are missing. }
  { 2 extra chars to provide for leading and trailing carets. }
  outstr := dup_char(nomark, 2 + length(response));
  for r := 1 to rlg + 1 do
    { Ignorable word chars are appended to preceding real word as if punctuation. }
    if not (r in rignore) then
    begin
      m := wordmark[r];

      { Move ahead 1 char if no missing wd mark (remember extra char at front). }
      if (m[1] = missingwd) then
        p := rwxloc[r]
      else
        p := rwxloc[r] + 1;
      { Replace blanks at word location with markup string for the word. }
      delete(outstr, p, length(m));
      Insert(m, outstr, p);
    end;

  mark_sentence := outstr;
end; { mark_sentence }

{ ---------------------------------------- Checkorder }
{ Top-level sentence checking procedure which executes the major sub-procedures }
{ needed to create the sentence markup. }
procedure checkorder;
var
  i: INTEGER;
begin
  { Initialize borders of spelling edit distance matrix; this is never }
  { cleared, so could be done in markup XFCN initialization if matrix were static. }
  init_spelling(marksP^);

  { Build matrix of normalized edit distances between all (M, R) word pairs. }
  fill_editd_matrix(mw, rw, mwseq, a, aseq);

  { Build sets that record which model and response words have matches. For each }
  { r word, build choices[r], a set of all possible matches for r. }
  list_possible_matches;

  { Try to extend matching by looking for run-together words among those so far }
  { unmatched. }
  if runtogether_needed and (not anyorderok) then
    find_runtogether;

  { Apply exhaustive search algorithm to find a matching which minimizes number }
  { of inversions. }
  find_best_order;

  { Generate strings for order and spelling markup. }
  if word_markup_needed then
    word_markup;
end; { checkorder }



{ ---------------------------------------- Compare }
{ Control structure for (default) full spelling and word order analysis. }
{ Internalizes the two strings 'model' and 'response' to prepare them for use }
{ in the order checking algorithm, then runs that algorithm. }
{ Output data structure (global): }
{ wordmark : A vector of markup strings, 'wordmark', with one entry }
{            for each word of the response. }
function compare (model, response: string): string;
begin
  judgedok := False;
  setmodel(model);
  setresponse(response);
  checkorder;
  judgedok := ((pmatched = 1.0) or (extrawordsok and (mmatched = [1..plg] - mignore))) and ((pnoninversions = 1.0) or anyorderok)
              and ((avedist = 0.0) or misspellOK);
  compare := mark_sentence;
end; { compare }
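The acceptance decision in compare combines the three fit measures with the author's tolerance flags. A minimal Python restatement follows; the function `judged_ok` and its parameter names are hypothetical, and `all_model_words_matched` stands for the set test "mmatched = [1..plg] - mignore" in the listing.

```python
def judged_ok(pmatched, pnoninversions, avedist, all_model_words_matched,
              extra_words_ok=False, any_order_ok=False, misspell_ok=False):
    """Return True when the response is acceptable under the given tolerances."""
    # Words: everything matched, or extras tolerated and no model word missing.
    words_ok = pmatched == 1.0 or (extra_words_ok and all_model_words_matched)
    # Order: no inversions, or word order not being judged.
    order_ok = pnoninversions == 1.0 or any_order_ok
    # Spelling: no edit distance on matched words, or misspellings tolerated.
    spelling_ok = avedist == 0.0 or misspell_ok
    return words_ok and order_ok and spelling_ok


# An extra response word fails by default but passes when extras are allowed.
print(judged_ok(0.8, 1.0, 0.0, True))
print(judged_ok(0.8, 1.0, 0.0, True, extra_words_ok=True))
```

Each flag relaxes exactly one of the three conditions, so a response with both an extra word and a misspelling needs both tolerances switched on to be judged ok.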

{ ---------------------------------------- FORMAT OUTPUT TO HYPERCARD }

{ ---------------------------------------- FormatMarkUpMaps }
{ Re-format the R_TO_M and M_TO_R maps into a string suitable for return }
{ to HyperCard. The maps are returned as lists of comma-separated integers, }
{ R_TO_M on line 1, and M_TO_R on line 2. The code below also appends the }
{ RWXLOC response word locations as a third line. }
function formatMarkupMaps: Str255;
var
  i: INTEGER;
  s, t: Str255;
begin
  s := '';
  for i := 1 to rlg do
  begin
    numToStr(paramPtr, r_to_m[i], t);
    s := Concat(s, t, ',')
  end;
  s[length(s)] := chr(13); { substitute newline for dangling comma }

  for i := 1 to plg do
  begin
    numToStr(paramPtr, m_to_r[i], t);
    s := Concat(s, t, ',')
  end;
  s[length(s)] := chr(13); { substitute newline for dangling comma }

  for i := 1 to rlg do
  begin
    numToStr(paramPtr, rwxloc[i], t);
    s := Concat(s, t, ',')
  end;

  formatMarkupMaps := omit(s, length(s), 1); { remove dangling comma }
end; { formatMarkupMaps }
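The layout of the returned map string is easiest to see on a small example. This Python sketch reproduces the format built by the function above (comma-separated integers, one list per line, lines separated by the carriage return chr(13)); the name `format_markup_maps` and the sample values are hypothetical.

```python
def format_markup_maps(r_to_m, m_to_r, rwxloc):
    """Join each integer list with commas and separate the lists with CR."""
    lines = [",".join(str(n) for n in seq) for seq in (r_to_m, m_to_r, rwxloc)]
    return "\r".join(lines)


# Response "time then the" against model "the time" after adjustment:
# response word 1 -> model 2, word 2 unmatched, word 3 -> model 1.
maps = format_markup_maps([2, 0, 1], [3, 1], [1, 6, 11])
print(maps.split("\r"))
```

A HyperTalk caller would read the result line by line, so the carriage return, the classic Mac OS line ending, is used rather than a line feed.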

{ Access info requested by PARAMDISPLAYNEEDED and put it into HyperCard global THEMARKUPPARAMDISPLAY. }
procedure formatParamDisplay (switch: CHAR);
var
  i: INTEGER;
  p, q: phon_variants;
  s: Str255;
  c, ch: CHAR;
  h: handle;
  hp: PTR;
begin
  c := ',';
  s := '';
  h := NewHandle(0);
  appendStringToHandle(h, Concat('MUParamDisplay ', switch, chr(13)));

  case switch of
    'v':
      appendStringToHandle(h, versionStr);
    'b':
      for i := 1 to 255 do
      begin
        ch := chr(i);
        if case_info[ch] <> down_case then
          appendStringToHandle(h, Concat(s, ch, '=', base_char[ch], ','));
      end;
    'd':
      for i := 1 to 255 do
        if diacrit_info[chr(i)] <> no_accent then
          appendStringToHandle(h, Concat(s, CHR(i), NtoS(ORD(diacrit_info[chr(i)])), c));
    'c':
      for i := 1 to 255 do
        if case_info[chr(i)] <> down_case then
          appendStringToHandle(h, Concat(s, CHR(i), c));
    'h':
      for i := 1 to 255 do
        appendStringToHandle(h, Concat(s, CHR(i), NtoS(ORD(phon_info[chr(i)])), c));
    'p':
      for i := 1 to 255 do
        if chr(i) in delim_chars then
          appendStringToHandle(h, Concat(s, CHR(i)));
    'w':
      begin
        appendStringToHandle(h, Concat(NtoS(winsert), c, NtoS(wdelete), c, NtoS(wchange), c, NtoS(wtranspose), c, NtoS(wcap), c,
                                       NtoS(waccent), c));
        appendStringToHandle(h, Concat(EtoS(prop_errors), c, EtoS(cutoff), c, EtoS(runon_criterion)));
      end;
    'f':
      begin
        case ORD(cap_flag) of

1: 

2: 

a :- 

3: 

a :- 

        end; { CASE }
        appendStringToHandle(h, Concat(s, c, BtoS(anyOrderOk), c, BtoS(extraWordsOk), c, BtoS(misspellOk), c));
        appendStringToHandle(h, Concat(BtoS(word_markup_needed), c, BtoS(runtogether_needed), c, BtoS(adjust_needed), c, BtoS(shortcut), c));
        appendStringToHandle(h, Concat(paramDisplayNeeded, c, rawTrace, c, BtoS(trace)));
      end;
    's':
      appendStringToHandle(h, Concat(addcap, dropcap, accenterr, extrawd, missingwd, movewd, extraltr, missingltr, substituteltr,
                                     transltr1, transltr2, runonwd));
    'm':
      for p := vowel to phon5 do
        for q := vowel to phon5 do
          appendStringToHandle(h, concat(s, NtoS(phon_matrix[p, q]), c));
    otherwise
      FAIL('%Invalid info type. Use: Version, Base, Diacrit, Cap, Punct, pHon, Weights, Flags, markupSymbols, phon_Matrix');
  end; { CASE switch }

  { Add null char terminator for HyperCard string. }
  i := GetHandleSize(h);
  hp := PTR(ORD(h^) + i - 1); { Ptr to last byte of block. }
  if (i > 0) & (hp^ = ORD(',')) then
    hp^ := 0 { Replace trailing comma with null char terminator. }
  else
    appendStringToHandle(h, CHR(0)); { Append null char terminator. }

  { Assign handle to global. }
  setGlobal(paramPtr, 'theMarkupParamDisplay', h);
  disposHandle(h);
end; { formatParamDisplay }



{ ---------------------------------------- TOP-LEVEL CONTROL STRUCTURE FOR MARKUP XFCN }

{ ---------------------------------------- MarkUp }
{ Top-level controlling procedure which unpacks HyperCard parameter values, }
{ generates markup and sets up values for return to HyperCard. }
procedure markup (paramPtr: XCmdPtr);
var
  h: handle;
  p: PTR;

{ ---------------------------------------- GetStringParam }
{ Converts the PARMNUM-th XFCN input parameter to a string. }
  function getStringParam (parmNum: INTEGER): Str255;
  var
    s: Str255;
  begin
    if (paramPtr^.paramCount < parmNum) then
      getStringParam := ''
    else
    begin
      ZeroToPas(paramPtr, paramPtr^.params[parmNum]^, s);
      getStringParam := s;
    end
  end; { getStringParam }

{ ---------------------------------------- GetCharParam }
{ Converts the PARMNUM-th XFCN input parameter to a character. }
{ If the parameter is missing or empty, then DEFAULT is assigned as the value. }
  function getCharParam (parmNum: INTEGER; default: CHAR): CHAR;
  var
    s: Str255;
  begin
    if (paramPtr^.paramCount < parmNum) then
      getCharParam := default
    else
    begin
      ZeroToPas(paramPtr, paramPtr^.params[parmNum]^, s);
      if length(s) < 1 then
        getCharParam := default
      else
        getCharParam := s[1];
    end
  end; { getCharParam }

{ ---------------------------------------- GetBooleanParam }
{ Converts the PARMNUM-th XFCN input parameter to a boolean value. }
{ If the parameter is empty, then DEFAULT is assigned as the value. }
  function getBooleanParam (parmNum: INTEGER; default: BOOLEAN): BOOLEAN;
  var
    s: Str255;
  begin
    s := getStringParam(parmNum);
    if s = '' then
      getBooleanParam := default
    else
    begin
      getBooleanParam := strToBool(paramPtr, s);
      if paramPtr^.result <> noErr then
        FAIL(concat('%Bad boolean input param value', s))
    end;
  end; { getBooleanParam }

{ ---------------------------------------- GetCapParam }
{ Converts the PARMNUM-th XFCN input parameter to a cap variant value. }
{ If the parameter is empty, then DEFAULT is used. }
  function getCapParam (paramNum: INTEGER; default: cap_flag_type): cap_flag_type;
  var
    s: Str255;
  begin
    s := getStringParam(paramNum);
    if s = '' then
      getCapParam := default
    else if eq(s, 'exact_case') then
      getCapParam := exact_case
    else if eq(s, 'authors_caps') then
      getCapParam := authors_caps
    else if eq(s, 'ignore_case') then
      getCapParam := ignore_case
    else
      FAIL(concat('%Bad cap_flag input param value: ', s))
  end; { getCapParam }

begin { markUp }
  p := nil; { Cuz FAIL operates on P. }
  { Check input parameter syntax. }
  if (paramPtr^.paramCount = 0) then
    FAIL(versionStr);
  if (paramPtr^.paramCount < 2) then
    FAIL('%MODEL and RESPONSE parameters required');

  { Clear debug global. }
  returnInGlobal('theMarkUpDebug', '');

  { Get memory from Mac heap. }
  p := NewPtr(sizeOf(lsmatrix));
  if p = nil then
    FAIL('%Couldn''t get matrix memory.')
  else
    marksP := LSMATRIXPTR(p);

  { Unpack the input parameters and format them as Pascal variables. }
  { Default settings are used if no parameter or empty parameter value. }
  model := getStringParam(1);
  response := getStringParam(2);
  cap_flag := getCapParam(3, exact_case);
  extraWordsOK := getBooleanParam(4, FALSE);
  anyOrderOk := getBooleanParam(5, FALSE);
  misspellOK := getBooleanParam(6, FALSE);
  word_markup_needed := getBooleanParam(7, TRUE);
  runtogether_needed := getBooleanParam(8, TRUE);
  adjust_needed := getBooleanParam(9, TRUE);
  shortcut := getBooleanParam(10, TRUE);
  markupMapsNeeded := getBooleanParam(11, FALSE);
  paramDisplayNeeded := getCharParam(12, 'x');
  rawTrace := getCharParam(13, 'x');
  trace := getBooleanParam(14, FALSE);

  { Initializes all static data structures, including the char info tables, punct table, }
  { markup symbol table, phon_matrix, weights and threshold values. }
  init_markup;

  { Format and return markup parameter display via global variable 'theMarkupParamDisplay'. }
  if paramDisplayNeeded <> 'x' then
    formatParamDisplay(paramDisplayNeeded);

  { Do all the markup work here; return the markup symbol string as the value of the XFCN. }
  if rawTrace = 'x' then
  { If requested full (default) spelling and word order analysis. }
  begin
    { Compute markup string as direct return. }
    paramPtr^.returnValue := PasToZero(paramPtr, compare(model, response));

    { Format and return judging information via global variable 'theMarkUpReturnValues'. }
    returnInGlobal('theMarkUpReturnValues', concat(BtoS(judgedOk), c, EtoS(pMatched), c, EtoS(pNonInversions), c, EtoS(aveDist)));

    { Format and return matching map information via global variable 'theMarkupMaps'. }
    if markupMapsNeeded then
    begin
      h := PasToZero(paramPtr, formatMarkupMaps);
      if h = nil then
        FAIL('%Out of memory while formatting markup maps.')
      else
      begin
        setGlobal(paramPtr, 'theMarkupMaps', h);
        disposHandle(h)
      end
    end
  end
  else
  { If pure least-edit-trace analysis on input strings was requested, }
  { then generate edit distances and edit trace on raw input strings. }
  begin
    edit_trace(model, response);
  end;

  { Get rid of dynamic memory. }
  disposPtr(p);

  { Don't pass the MARKUP message up HyperCard's inheritance structure. }
  paramPtr^.passFlag := FALSE;
end; { markup }

{ ---------------------------------------- MAIN }
begin { main }
  markup(paramPtr);
end; { main }
end. { unit MarkupXFCN }


